Preface


The Shape of the Book 


This is a book on the history of mathematics; its basic dynamic is historical and therefore,
up to a point, chronological. It follows the progress of a number of ideas that
grew, sometimes came together, and often developed rich and fascinating branches 
and applications. At its core is an account of how the calculus of Newton and 
Leibniz—the calculus of functions of a single variable—led to attempts to develop a 
calculus of functions of several variables and how these new mathematical methods
contributed to the study, first of ordinary, and then of partial differential equations. 
In each case, the rationale for that work was chiefly to develop general methods that 
could tackle problems in geometry and mechanics (the motions of solids and liquids 
under the action of forces). 

The physical world being a complicated place, most of the applications involved 
partial differential equations, and here the story soon also became complicated. The 
first-order partial differential equation in two independent variables was initially difficult
to solve, and this posed problems for the study of more than two independent
variables and for equations of higher order. Important work on the first-order case
was done by Lagrange and Monge before Cauchy was finally able to show that such
equations almost always have a solution. But the second-order case almost immediately
confined itself to three special cases, somewhat as Euler had suggested, and all
of them, as we would say, linear. The first, and simplest, is the wave equation (the
prototype hyperbolic equation), successfully tackled by d’Alembert. Euler regarded
the one later known as the elliptic case (the key example being the Laplace equation) 
as being beyond current methods. Finally, the case we call parabolic fell through a 
gap in his approach, and strangely little was said about it before Fourier dealt with 
the canonical example: the heat equation. At this point, a significant departure from 
the theory of ordinary differential equations opened up: the need to pay attention to 
initial or boundary conditions. However, this issue was to remain obscure for several 
decades. 

Euler quickly showed that linear ordinary differential equations with constant
coefficients can be solved systematically. Other types of ordinary differential equations
were studied in the eighteenth century, but the story is piecemeal, and instead,
I chose to give just one example of the history of ordinary differential equations: the 
hypergeometric equation from Gauss to Riemann, Schwarz, and Poincaré. This is 
one of the glories of the subject, bringing together early ideas about group theory, 
complex function theory, and the then-novel hyperbolic or non-Euclidean geometry. 

So how is all this material organised in this book? Chapter 1 connects the calculus
to problems in ordinary differential equations and is mirrored by Chaps. 3 and 5 in 
which the calculus of several variables is developed and the first partial differential 
equations are studied. Then it is a fairly straight run through topics in partial differential
equation theory in Chaps. 6, 8, 10, 13, 17–20. This allows us to see how the work
of Euler, d’Alembert, and a few others rewrote Newton’s Principia Mathematica for
the eighteenth to twentieth centuries. The story of the hypergeometric equation occupies
Chaps. 11 and 14–16 because it must start in 1812 with Gauss and because once
it gets going it seems ridiculous to break it up. What intervenes here is Chap. 12 on
Cauchy’s demonstration of the existence of solutions to ordinary differential equations
and Chap. 13 on Riemann’s geometric version of complex function theory,
which is needed for the subsequent three chapters.

What of the chapters not yet referred to? Chapter 2 describes the start of the 
calculus of variations, and Chap. 7 takes that subject further into the eighteenth 
century. Chapter 4 documents other successes of the partial differential calculus in 
studying natural phenomena other than the wave equation. (There is also a surprising 
link to the hypergeometric equation.) Chapters 9 and 21 are opportunities for revision; 
when I gave the course I used these lectures to discuss the assessment on the course 
so far. 

The remaining chapters move into what may be less familiar material: Riemann’s
study of shock waves; Riemann and Weierstrass on minimal surfaces; the work of
Thomson and Stokes on the telegraphist’s equation and the laying of the trans-Atlantic
cable; a look at the first nineteenth-century attempts to rigorise the calculus of variations;
the eventual introduction of the fundamental trichotomy (elliptic, parabolic,
hyperbolic) for second-order linear partial differential equations; and the first general
existence theorems in the elliptic and hyperbolic cases, including Hadamard’s insistence
on the distinction between initial and boundary value problems. Two chapters
look at how Jacobi used Hamilton’s ideas to create Hamilton-Jacobi theory and 
subsequent attempts to geometrise mechanics, and the connection to the solution of 
first-order partial differential equations. 

All this material has a certain coherence that is worth spelling out. Ordinary differential
equations grew out of, or alongside, problems in evaluating integrals, which is
why we still talk, confusingly, of integrating a differential equation and of its solutions
as its integrals. It was soon recognised that the solution to an ordinary differential
equation was a family of functions and an individual solution could be specified 
by means of some initial conditions. So, it was natural when differential equations 
with several independent variables were investigated that the earliest researchers 
(Jean le Rond d’Alembert, Leonhard Euler, Pierre Simon Laplace, and Joseph-Louis
Lagrange) thought of these partial differential equations in the same way, and looked
for techniques that would produce a formula for the general solution (although they
seldom discussed the auxiliary process of fitting the general solution to some
initial conditions). Part of the story here is the gradual recognition that this is not the
right way to think of partial differential equations. Rather, it is a dialogue between 
the general methods and the initial or boundary conditions that is central, and which 
underpins the crucial distinction between the elliptic and hyperbolic types to which 
formal methods are blind. As we shall see, this explains the problematic way in which 
complex variables were first used. 

It is also interesting to see how questions of rigour enter the story in a way that 
is immediately important and does not appear as the whim of an analytic pedant. 
The ad hoc methods that can be used to solve a partial differential equation (such 
as the separation of variables) naturally raise the question of the uniqueness of the 
solutions that is important in applications. The need for power series to converge 
forces a heavy reliance on (real or complex) analytic methods that has, ultimately, to 
be outflanked. 


Advice to Students 


There are several important things being described in this course, and it may help to 
remember what they are before immersing yourself in the technical details. Newton’s 
work is a case in point. Even though we shall only skim its surface, it is clear that 
this is a remarkable achievement, one that it took more than a century to confirm, 
and some of the best work of the twentieth century to surpass. Newton’s account 
of the motion of the Moon and the planets does not rest on the calculus, still less 
differential equations, but everyone after him turned to the calculus, and Euler gave 
everyone the means to write celestial mechanics that way ever after. Mathematical 
accounts of fundamental physical processes—gravity, the production and spread of 
sound, the propagation of heat and of electric signals—are among the successes of 
the theory of partial differential equations. 

But for the calculus to do this work, mathematicians have to have the confidence 
that it does work. This is partly a matter of rigour, and indeed it is satisfying to see 
that so many of the questions that are dealt with in courses on pure analysis arose 
in contexts where a practical, or at least a physical, answer depended on the quality 
of the reasoning. Less obviously, but perhaps more interestingly, it is worth seeing 
what even the best mathematicians did with difficult problems. Arguments have to 
be rigorous—ultimately. Before then they have to be some or all of convincing, intel- 
ligible, general, plausible, and applicable. Likewise, solutions have to be a number 
of things apart, ideally, from being right: among the criteria are, on one occasion or 
another, computable, accurate enough, intelligible, complete, and unique. Just as an 
argument might get to the heart of a problem or somehow merely work, a solution 
can be truly informative or merely a formula. All of this is on display here. And, of 
course, sometimes an equation with no answer in sight can still seem to be a valuable 
advance. 




In this sense, the most momentous event on display here in the first half of the 
course is that the calculus, in the form of differential equations, both ordinary and 
partial, can deliver so much insight. The comparable change in the second half of the 
course, as I hinted above, is the transformation in what a solution is taken to be. For 
partial differential equations this is the rise to equal importance with the equation of 
the boundary or initial conditions, coupled as it is with a profound classification of 
these equations into types. The need for rigour played its part in these developments 
when appeals to the so-called generality of analysis and its supposed algebraic or 
formal basis began to fail. 

This is not a set of lectures in which epsilons and deltas, ns and Ns dance ever more 
intricately, but this should not suggest that when mathematics is applied—whatever 
that might mean—standards drop. Mathematicians were doing their best at all times 
to get it right, although we can observe different ways in which they honoured 
that commandment. The difficult mathematics here comes from the difficulty of the 
problems: a partial differential equation is a difficult thing to understand, harder than 
an ordinary differential equation, and harder than many an early investigator realised. 
Qualitative arguments are often harder than quantitative ones, if less technical. 

The challenge you face is to get a sense of that struggle, of the difficulty, and how 
it was tackled. 

Being a historian of mathematics means attending to mathematics on its own 
terms as well as ours, and seeing it in the context of its time. What was known, what 
was thought to be true? When a mathematician tackles a problem, you ask: How had 
other problems like this one been tackled, how were they tackled after this one? Is 
the analysis of the problem convincing, is the solution informative? What, in the end, 
were these people trying to do? 


Historiographical Remarks 


There are several existing accounts of the history of calculus, and a number of 
specialist books and articles on particular aspects of that history. The contributions 
of Newton and Leibniz, Euler, Lagrange, Fourier, Cauchy, Riemann, Weierstrass, 
Poincaré, and Hadamard have been studied in some depth; various topics, such as the 
wave equation, the heat equation, Laplace’s equation, and the Dirichlet problem have 
been looked at in some detail, although not always after the original breakthroughs 
were made. But there is no general history of differential equations, ordinary or 
partial. Histories of the calculus dwell on the story of the rigorisation of the calculus 
and the creation of modern (or, rather, nineteenth century) mathematical analysis, 
but tend to marginalise the story of what made the calculus valuable: the capacity 
it gives mathematicians and scientists to formulate and solve problems across the 
fields of physics and geometry. Historians have tended to forget that what made the 
calculus worth all the efforts to understand it was not ideas about infinitesimals, 
differentials, limits, and the like that were introduced to explain and justify it, but its 
many successes in providing an understanding of the natural world, from the motion 
of the planets to the transmission of electric signals, and in extending the powers of
geometry. 

There have been a few notable departures from this scholarly regime in recent 
years. Craig Fraser and Jesper Lützen have steadily enriched our understanding, and
other historians (Tom Archibald, June Barrow-Green, Umberto Bottazzini, Christian
Gilain, and Tom Hawkins, among them) have dealt with various aspects of the development
of the theory of differential equations as it entered into the larger pictures
they were exploring. 


What This Book is Not 


The largest omission is the work of Maxwell and the equations named after him, but 
it seemed to me that the modern theory, and the physical experiments that it explains, 
are largely unknown to mathematics students and would have entailed too great a 
detour to bring to life. In addition, Maxwell’s ideas about the physics involved are 
not the modern ones—and famously, no Continental physicist claimed to understand 
them—so it would have been impossible to do them justice in the space available. 
Disappointed readers should consult Buchwald’s From Maxwell to Microphysics. For 
much the same reasons, I was unable to deal with hydrodynamics and the Navier–Stokes
equations, but readers may always turn to Darrigol’s Worlds of Flow.

Another topic that is wholly missing is the use of perturbative methods. Most 
differential equations, and systems of such equations, that arose in practice could 
only be tackled by the method of undetermined coefficients. The idea was to start 
from a simplified version of the problem at hand that, however, admitted an exact 
solution, and to seek the solution to the actual problem by adding
more terms to cope with the increased complexity. These might take the form of 
power series, or later trigonometric series, which were fitted to what data there was, 
especially in the important subject of astronomy, and their coefficients adjusted to 
refine predictions and explain other effects. 

Another important topic that it would be good to have included is Sturm-Liouville 
theory, but there is already an excellent historical account in Lützen’s Joseph Liouville
(1809–1882): Master of Pure and Applied Mathematics, and I thought it better to add
to the stock of historical information about the development of differential equations. 

I would very much have liked to have concluded the course with Poincaré’s ideas 
about flows on surfaces, and his brilliant extension of these techniques in his famous 
memoir on the three-body problem, which would have made an attractive connection 
back to Newton’s Principia Mathematica, but there simply was no room. However, 
there are existing accounts of this subject.¹

And, I admit, the wish to try to say something about the history of partial differential
equations, surely the largest omission in the history of modern mathematics,
also played a part in my decisions about what to include. 


¹See most informatively (Barrow-Green 1997), and also (Gray 2013) and (Verhulst 2012).

So, dear reader, if your favourite topic is not here, and especially if there is not
a good modern history of it, then the opportunity is there for you to write it and 
in that way fill a gap in the literature. There are short overviews of the historical 
development of differential equations (for example, in Kline 1972) and there are
detailed treatments of selected topics, as I have tried to acknowledge and benefit 
from. There is ongoing work by a number of historians of mathematics, but the 
fact remains that the history of mathematics is tied closely to the history of pure 
mathematics through a shared interest in foundations, and the history of classical 
applied and applicable mathematics lags behind. This would be merely unfortunate, 
were it not for the fact that it is through differential equations that the calculus largely 
justified its existence—geometry being another, but smaller, vital field. 

Without histories of differential equations, we lack a significant part of the history 
of mathematics. We cannot properly explain to our students where we are coming 
from and how we got here, we cannot explain the significance of mathematics to 
historians of science, and we are hindered in our attempts to rescue philosophical 
accounts of mathematics from the grip of foundationalists, who see only set theory 
and logic. 

There are considerable losses. We are likely to leap from Newton, Leibniz, and 
the invention of the calculus straight to Cauchy, with perhaps a glance at Lagrange’s 
unsuccessful earlier attempt at rigorising the calculus. In this way, the entire eighteenth
century is largely forgotten and is only dealt with in fragments. The study of
partial differential equations is reduced to what I jokingly refer to as solving the only 
four partial differential equations that exist: the general first-order partial differential 
equation, Laplace’s equation, the heat equation, and the wave equation. 

This book is an attempt to fill in some of the gaps. 


Sources and Their Uses 


There is inevitably a shortage of material in English on these topics. Newton, of
course, has been generously put into English when he did not write in it himself; J. M.
Child translated some of Leibniz’s considerable and mostly unpublished writings of
relevance in (Child 1920); and for the eighteenth century, there is the remarkable
and growing resource of the Euler Archive, where almost all of the original work of
Euler can be found along with many substantial translations. Among the nineteenth-century
mathematicians, almost all of Riemann’s work is now in English, as are
Hilbert’s remarks in his Paris address on Mathematical Problems; and Hamilton and 
Green naturally wrote in English. The rest remains in Latin, French, and German, 
and a richer study would embrace Italian and Russian. 

Source books have done something to ease the students’ paths: Struik’s on the 
period 1200-1800 and Birkhoff’s rather freer translations of nineteenth-century work 
are very helpful, and more can be found in the book by Fauvel and Gray (referred to 
here as F&G). Historians’ translations of shorter extracts can also be found in their 
papers. 

I have therefore added to the collection of works translated into English some
items from Cauchy (on the existence of solutions to first-order partial differential
equations), Darboux (on the telegraphist’s equation), Schwarz (on analytic maps
from a half-plane or disc to a polygon, his alternating method, and part of his paper 
on the hypergeometric equation), and a passage from the introduction Picard wrote 
to one of his papers. 

As for illustrations, I had originally planned to include pictures of most of the 
important mathematicians whose work is discussed in this book, but copyright issues 
posed obstacles in a number of cases. However, these days a great many pictures, 
very often accurately identified, are available on the internet. 


Advice to Instructors 


This book is the fourth and last of my books based on courses I taught, each over a 
period of four years, at the University of Warwick. Together, they cover the emergence 
of a fair amount of mathematics in the standard syllabus at many universities today. 
They have been published at a time when the prospects for courses in the history of 
modern mathematics in Britain have become poor, I believe for two reasons. First, 
in Britain as in many places, there seem to be few prospects for anyone wanting to 
shape a career as a historian of mathematics; students know this, and they seldom 
entertain the idea at the graduate level.² Second, there are problems for anyone
wanting to run a course in the subject: problems of language, problems of sources, 
problems with assessment. These four books are offered partly as a way around the 
second problem, and that accounts for their content, specifically the three chapters 
on assessment. I wanted to show that there are ways to assess students’ grasp of the
history of mathematics that are not simply exercises in old mathematics, and the 
result is the adaptation of what my Open University colleagues and I did to more 
advanced topics. 

There are many reasons why a course in the history of mathematics at university 
can benefit students. It humanises the subject, demonstrates the intent behind many 
discoveries, and helps to explain why we have the mathematics we do. It always 
seemed to me that any history of mathematics course best belonged in the students’ 
final year, when they already know enough mathematics for the history to get a proper 
look in. In a world in which few students go on to do research in straight mathematics, 
but many go on to be mathematicians in a huge variety of environments, I believe 
that a historical overview of part of the subject offers at least as much value as any 
other specialism. 

Of course, I do not claim that any of the four volumes is the course to adopt. 
It might well make sense to use any of the books selectively. This one might yield 
a course on partial differential equations, or a short course on the hypergeometric
equation, for example. It will depend on the audience. And I would cheerfully admit
that almost every chapter here is too long to be a lecture; indeed I never taught it that
way. Each chapter is a resource, and in the absence of other material for the student
to read, I thought it best to provide enough for readers to engage with. There is more
than enough for three lectures a week here, but not too much for a week’s study.

²Nor is there much room for history of mathematics in the tightly determined school mathematical
syllabus.


Chapter 1
The First Ordinary Differential Equations


1.1 Introduction


The chapter is in two parts. In the first, we look briefly at the discovery of
methods for finding tangents, which is, of course, part of the seventeenth-century
discovery of the methods of calculus. The first inverse tangent problem, the precursor
of differential equations, was asked early on, in 1638, and it is interesting to see
that neither Descartes nor Leibniz could properly solve it.

In the second part, we see how the calculus, in Euler’s hands, led to the development
of methods for solving various kinds of ordinary differential equation, including
Debeaune’s. The work of Euler and Bernoulli on the vibrating rod and the hanging chain led
to a breakthrough in the study of linear differential equations and the introduction 
of the idea of a basis of solutions. Even more importantly, Euler was able to adapt 
the methods of the calculus to the study of mechanics, and so was able to express 
Newton’s laws of motion for the first time as differential equations. 


1.2 Origins: Inverse Tangent Problems 


In the 1620s and 1630s, various mathematicians—Pierre Fermat, René Descartes, 
and Gilles Personne de Roberval among them—began to develop methods for finding 
tangents to curves, either at a given point on the curve or from an arbitrary point
to the curve. With the success of these methods, it became possible to think of raising
and answering the opposite question, that of finding a curve given some properties 
of its tangents. 

The person with the honour of having formulated the first inverse tangent problem 
is Florimond Debeaune. Debeaune was a wealthy member of the nobility in his 
hometown of Blois, where he was born in 1601 and where he became a counselor 
at the Court of Justice. He also had a reputation as a high-quality lens grinder, and 
in 1639, Descartes wrote to him to ask him to design a machine that would make
hyperbolic lenses. The project failed, but they remained in touch, and Debeaune went
on to write the Notes briéves that were published in 1649 in the first Latin edition of
Descartes’s La Géométrie. In this work, he showed that the equations $y^2 = xy + bx$,
$y^2 = -dy + bx$, and $y^2 = bx - x^2$ represent a hyperbola, a parabola, and an ellipse,
respectively.

Fig. 1.1 Debeaune’s problem

Debeaune was led to propose his problem in 1638 as a result of his study of Descartes’s La
Géométrie, so soon did Descartes’s ideas begin to transform geometry.¹ He raised
it out of his interest in explaining mathematically why a plucked string vibrates as it
does, specifically, in explaining why the frequency with which the string vibrates is
independent of the force with which it is struck.² It was one of four problems that he
presented to the mathematical community, and it has come down to us in the form 
of a letter to Roberval. 


1.2.1 Debeaune’s Problem 


Debeaune stated the problem this way.³


Let there be a curve AXE whose vertex is A, axis AYZ, and the property of this curve is
that, having taken any point on it you wish, say X, from which the line XY is drawn as
a perpendicular ordinate to the axis, and having taken the tangent GXN through the same
point X, and extended the perpendicular XZ to it at X until it meets the axis, there will be
the same ratio of ZY to YX as a given line, like AB, has to the line YX − AY (Fig. 1.1).


¹Descartes’s La Géométrie was published in 1637 as one of a number of appendices in his Discours
de la Méthode.

²Debeaune to Marin Mersenne, March 1639, Mersenne Correspondance VIII, 348, in F&G 11.B1(b).
Marin Mersenne was a Minimite friar who operated an informal postal service for the
communication of letters across Europe about science and mathematics.

³Debeaune, letter to Roberval, sent to Marin Mersenne, Mersenne Correspondance VII, 142–143,
in F&G 11.B1(a).


Draw the axis, AYZ, a curve, AXE, a tangent, GXN, at some point, X, on the
curve, and erect the perpendicular, XY, as shown. Locate the line segments ZY, YX,
and AY, and the line segment AB that provides a unit of length. Debeaune’s problem
asks for the curve, AXE, whose tangents have the property that

\[
\frac{ZY}{YX} = \frac{AB}{YX - AY}.
\]


Debeaune probably expected an answer in the form of a recipe for constructing 
points on the curve geometrically, rather than as an equation in some system of 
coordinates, but in any case, he was to be unlucky. Roberval showed that the line through
B drawn at 45° to the axis is an asymptote to the curve, and in October 1638, Descartes 
gave a mechanical description of how the curve might be drawn approximately, which 
was sufficient to confirm Roberval’s result, but no one could answer the challenge 
before Debeaune died in 1652. 

It has proved to be characteristic of inverse tangent problems (differential equations,
as they came to be called) that they can be easy to state but very difficult to solve. One
merit of the calculus was to be that it not only provided a way of stating inverse 
tangent problems but it also provided a set of rules for manipulating the problem 
symbolically until it could (quite often) be solved, at least in the sense that the 
solution curve could be described via equations or formulas. 
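
As an illustration of what such symbolic manipulation looks like in modern notation, here is a short sketch for Debeaune's problem. The labels x = AY, y = YX, a = AB, and the reading of ZY as the subnormal, are assumptions made for this sketch and are not Debeaune's own notation. On that reading, the condition becomes a first-order differential equation whose straight-line solution is consistent with Roberval's asymptote at 45° to the axis.

```latex
% Requires amsmath. Assumed labels: x = AY, y = YX, a = AB;
% ZY is read as the subnormal, y\,dy/dx.
\begin{align*}
\frac{ZY}{YX} = \frac{y\,\dfrac{dy}{dx}}{y} = \frac{dy}{dx}
  &= \frac{a}{y - x}. \\
\intertext{Put $u = y - x$, so that $du/dx = dy/dx - 1$:}
\frac{du}{dx} &= \frac{a - u}{u},
  \qquad\text{so } u \equiv a \text{ (the line } y = x + a\text{) is a solution.} \\
\intertext{Separating variables, with $u = 0$ at the vertex $A$ ($x = y = 0$):}
x &= y - a + a\,e^{-y/a},
  \qquad\text{and hence } y - x \to a \text{ as } y \to \infty.
\end{align*}
```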


1.2.2 Other Inverse Tangent Problems


Inverse tangent problems often arose naturally in the contemporary study of physical 
and astronomical problems. 

For example, in the 1670s, Claude Perrault, who is best remembered as the architect
of the east wing of the Louvre Palace in Paris, asked for the curve traced by a
heavy weight drawn behind someone walking along a straight line. The solution is
a curve called the tractrix (see Fig. 1.2), which had previously been considered by both
Newton and Leibniz, although they did not identify it as a solution to this problem,
and was later considered by Huygens. More formally, it is the curve with the property that the length
of the tangent from a point on the curve to a fixed line is a constant.
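
The defining property can be checked numerically. The sketch below assumes the standard modern parametrisation of the tractrix, x = a(t − tanh t), y = a/cosh t, with the fixed line taken as the x-axis. The parametrisation and the function names are illustrative choices, not anything found in the seventeenth-century sources.

```python
import math

def tractrix_point(t, a=1.0):
    """Point on the tractrix in the assumed parametrisation (fixed line = x-axis)."""
    return a * (t - math.tanh(t)), a / math.cosh(t)

def tangent_length(t, a=1.0, h=1e-6):
    """Length of the tangent segment from the curve to the x-axis.

    The tangent direction is estimated by a central finite difference.
    """
    x0, y0 = tractrix_point(t - h, a)
    x1, y1 = tractrix_point(t + h, a)
    _, y = tractrix_point(t, a)
    dx, dy = x1 - x0, y1 - y0
    # The tangent line through the point, with direction (dx, dy), meets
    # y = 0 at parameter s = -y / dy; the segment has length |s| * |(dx, dy)|.
    s = -y / dy
    return abs(s) * math.hypot(dx, dy)

# Sample a few points away from the cusp at t = 0, where dy vanishes.
lengths = [tangent_length(t, a=2.0) for t in (0.5, 1.0, 2.0, 3.0)]
```

Each entry of `lengths` should agree with the chosen constant a = 2 up to the finite-difference error, which is the constant-tangent property stated above.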

In his Principia Mathematica [206], Newton investigated the paths of particles 
that moved subject to forces directed at a central point. Later, the paths of particles 
moving under gravity and encountering various forms of air resistance were studied 
by Newton and others; these too arose as the answers to inverse tangent problems. 
In these cases, it is the instantaneous direction of acceleration that is known, not the 
instantaneous velocity, so the problem is not strictly an inverse tangent problem but 
a generalisation. 

Fig. 1.2 The tractrix


In 1696, Johann Bernoulli challenged the mathematical community to find the 
curve along which a sliding bead would descend most quickly between two given
points. As we shall see in more detail in Chap. 2, the answer is a cycloid with a
vertical tangent at the starting point (a cycloid is the curve traced by a point on the 
rim of a wheel rolling along a straight line). Five mathematicians responded, among 
them Newton, who stayed up all night and answered the question the day he received 
it; Bernoulli said he recognised the solution as Newton’s as one recognises the lion 
by his paw (it had taken Bernoulli 2 weeks). 

As these dates indicate, these problems were very difficult, and the calculus 
enabled some progress to be made on a broad front. Unsurprisingly, it was not always 
quite as simple as that. 


1.3 From Inverse Tangent Problems to Differential 
Equations 


In the 1690s and early 1700s, Johann Bernoulli became the leading exponent of the 
calculus, which he and his older brother had learned by corresponding with Leibniz 
and reading his published papers. He went to Paris in 1691 and got himself hired to 
teach the Marquis de l’Hôpital the new calculus, and as a result, the first book on the
differential calculus appeared with de l’Hôpital as the author.⁴

From the book, we can see that Bernoulli’s definition of integration is interesting,
for he defined it, as Newton had done, as the inverse of differentiation and not, as
Leibniz did, as an infinite sum, and he gave several methods for finding areas. Then
Bernoulli turned to inverse tangent problems and solved a variety of examples. The 
concluding, and arguably most important, part of the book was an exposition of how
problems in geometry or mechanics can be translated into the language of calculus.
Calculus was still very new, and showing how to express problems using it could be
the hardest part of a mathematician’s work.

⁴See L’Hôpital [186].

In summary, the translation procedure would go as follows.


1. Set up a system of x and y coordinates with respect to which the solution to the 
problem can be expressed as a curve, and then formulate the problem in terms of 
equations involving these coordinate variables. 

2. Interpret the problem as a statement about the relationship between neighbouring 
points on the curve; this will take the form of an equation involving differentials. 

3. Pass back from the differentials to the finite quantities (x and y) and so determine 
the precise form of the equation that describes the solution curve. 
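
The three steps can be sketched in modern terms on Debeaune's problem. This is a minimal illustration under stated assumptions: coordinates x = AY, y = YX and constant a = AB (step 1), the differential relation dy/dx = a/(y − x) (step 2), and a numerical integration standing in for step 3. The fourth-order Runge–Kutta integrator, the function names, and the candidate closed form x = y − a + a·exp(−y/a) are choices made for this sketch; nothing of the kind appears in the historical sources.

```python
import math

def dx_dy(y, x, a):
    """Step 2: the assumed relation dy/dx = a / (y - x), inverted to dx/dy
    so that the vertical tangent at the vertex causes no division by zero."""
    return (y - x) / a

def integrate_rk4(a, y_end, steps=1000):
    """Step 3, numerically: integrate x(y) from the vertex x(0) = 0 by classical RK4."""
    h = y_end / steps
    y, x = 0.0, 0.0
    for _ in range(steps):
        k1 = dx_dy(y, x, a)
        k2 = dx_dy(y + h / 2, x + h * k1 / 2, a)
        k3 = dx_dy(y + h / 2, x + h * k2 / 2, a)
        k4 = dx_dy(y + h, x + h * k3, a)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        y += h
    return x

def closed_form(a, y):
    """Candidate closed form for the branch through the vertex (checked against RK4)."""
    return y - a + a * math.exp(-y / a)

a, y_end = 1.0, 5.0
x_num = integrate_rk4(a, y_end)
x_exact = closed_form(a, y_end)
# y - x approaches a, so the curve approaches the 45 degree line y = x + a.
gap_to_asymptote = (y_end - x_exact) - a
```

Here `x_num` and `x_exact` should agree closely, and `gap_to_asymptote` is small and negative, so the curve approaches the line y = x + a from one side.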


This method was not invented by Bernoulli. Leibniz had already tackled 
Debeaune’s problem this way, however clumsily—Leibniz was prone to careless 
errors. Bernoulli raised his approach to the level of a systematic method for tackling 
many such problems, and this moment of transition is marked by a change of name.
Henceforth, inverse tangent problems came to be called differential equations, because
in step (2) they were literally expressed as equations involving differentials.⁵ That
the new name stuck shows how closely the new methods of the calculus became 
associated with problems involving instantaneous change or changes from point to 
point along a curve. 

None of this would be of any use if the resulting differential equation could not 
be solved. Now, to Bernoulli and his contemporaries, a solution ideally meant a 
geometrical description of the required curve. This is a global description of a curve, 
such as is usually given for a circle, a conic section, or a few other curves such as 
the cycloid. The calculus, however, did not always lend itself to providing such a 
thing. When stage (3) has been carried out successfully, the solution is expressed as 
an equation in coordinates x and y that defines a curve (depending on some initial 
conditions). But to mathematicians of the late seventeenth century, a further step was 
required in which the curve was characterised by some property by which it could 
be recognised, much as we today routinely gloss a curve defined by a quadratic 
equation as a particular sort of conic section. This amounts to reversing stage (1) 
and is much harder than traversing it, and since that was often hard enough, going 
backwards often proved to be too difficult. Indeed, why should the solution curve 
be any kind of known curve? However, if this step is not taken, the curve can at
best be drawn pointwise, and important properties of it might remain undetected.⁶
Gradually, mathematicians began to accept equations as the solution and not to look
beyond them; and the more they did so, the more formal and algebraic, and the less
geometrical, mathematics became.

The catenary problem is a good example. This asks for the shape of a heavy-weighted
chain, and so it is of obvious interest to bridge builders. Galileo had suggested
in 1638 it would “assume the form of a parabola”, but in 1646, Huygens
showed that this is incorrect.⁷ The problem was first solved by Leibniz and Johann
Bernoulli independently in 1691, and then raised again by Jakob Bernoulli in the Acta
Eruditorum in 1701 as a challenge to the mathematical community.⁸ This invites the
question of what Jakob Bernoulli thought he was doing asking a question that had
been dealt with successfully a decade before, and Paolo Freguglia, whose account is
our guide here, speculates that his aim was “to put this solution in a more theoretical
general context”, namely, isoperimetrical problems.⁹

⁵ For the same reason, British mathematicians spoke more and more of fluxional equations because
Newton had expressed his ideas in terms of fluxional quantities, such as the rates of change of
quantities.

⁶ This was an issue that Descartes recognised when he put forward his ideas about geometry in 1637.

Johann Bernoulli’s solution, written in 1701 but published only in 1706, proceeded 
by first expressing the force on an infinitesimal piece, AB, of the chain in terms 
of the weight of the string hanging below A, then by calculating the force at the 
neighbouring point B, and then by arguing that since AB does not move, the effect 
of the forces at A and B must cancel out. So the raw ingredients are the length, s, of 
the chain from the far end E to the point A—which is proportional to its weight—and 
the differentials dx and dy which come in because the string is curved and so the 
forces at A and B do not point in quite the same direction. The differential equation 
Bernoulli obtained is 

\[ \frac{dy}{dx} = \frac{s}{a}, \]


where a is a constant. Stage (ii) was completed when the variable s, which depends 
on the values of x and y, was eliminated in favour of an explicit expression involving 
x and y. We omit the details of how this was done, and pass straight to the resulting 
equation: 

\[ dy = \frac{a\,dx}{\sqrt{x^2 + ax}}. \]

This completes stage (ii). To carry out stage (iii), Bernoulli noticed, as Newton had
done much earlier, that the simplest kind of differential equation one could hope to
get was of the form

\[ (\text{something in } x)\,dx = (\text{something in } y)\,dy \]


because you could then hope to integrate both sides. The method of trying to arrange 
for this to happen he called the method of separation of variables. When the variables 
do not separate, Bernoulli enriched the method by suggesting that one looks for 
new variables with respect to which the differential equation does separate. This 
meant setting aside all qualms about the nature of differentials and manipulating 
them formally just as one does finite quantities in elementary algebra. Bernoulli’s
techniques are simple algebraic devices; his insight was in seeing that such methods
can be applied in their new setting. In so doing, he was following the lead of Newton
and Leibniz.

⁷ See Galileo, Two New Sciences [113], 149, and Huygens [149]. For a historical account, see
Bukowski [26].

⁸ See Bernoulli [8]. The solution is the curve called a catenary (the name derives from the Latin
catena, for a chain) with equation y = cosh x.

⁹ See Freguglia and Giaquinta [109]; the quoted remark is from a private communication.

Fig. 1.3. Bernoulli’s formulation of Debeaune’s problem

These ideas are well illustrated by Bernoulli’s discussion of Debeaune’s problem,
which he gave in one of a series of lessons to the Marquis de l’Hôpital in 1691.¹⁰


Another such example is the problem set to M. Descartes by M. Debeaune, the solution to 
which is not in his works but can be found in his Letters (vol. III, No. 71). The solution of 
it does not appear to be very easy according to our method, indeed at first sight the problem 
appears impossible by this method. But we shall see that by a change of variables it becomes 
easy to separate them, and that this problem can be solved completely once the quadrature 
of the hyperbola is given, for the curve is mechanical (Fig. 1.3). 


The problem goes like this: a line AC makes an angle of half a right angle with the axis 
AD, and E is a given constant line segment; what is the nature of the curve AB in which 
the ordinates BD are to the subtangents FD as the given E is to BC? 


Solution. Let AD = x, DB = y, E = a, suppose by hypothesis that dy : dx = a : (y - x),
then adx = ydy - xdy. From this equation the nature of the curve is to be found, either
by integration or by rewriting y with dy on one side and x with dx on the other, for then
two areas can be found and by comparing them the nature of the curve can be found. But
the equation just found cannot be integrated, nor can x and dx be separated from y and
dy; however, it can be changed into another by substituting the value of another variable.
Therefore let y - x = z, y = x + z and dy = dz + dx. The equation just found transforms
into this: adx = zdz + zdx or adx - zdx = zdz and dx = zdz : (a - z). Therefore these
two variables separate, and we are led to the curve on multiplying by a, adx = azdz :
(a - z).


[...] 
Corollary I. The curve AB has its asymptote parallel to AC.


Corollary II. The space [i.e. area] ADB = xy + ax - ½yy.


We see that Bernoulli first stated the problem, then he introduced coordinates
(stage (i)) and then differentials (stage (ii)). Aware in advance that the method of
separation of variables does not apply to the differential equation adx = ydy -
xdy, he introduced the change of variable y - x = z, which implies dy - dx = dz,
thereby extending a method used for finite quantities to differentials. Now he had an
equation in the variables x and z in which the variables do separate,

\[ dx = \frac{z}{a - z}\,dz. \]

¹⁰ See Bernoulli, J. Opera Omnia 3, 1742, 423-424, and F&G 13.B1.


As was thought necessary at the time, he did not accept the solution to the differential 
equation that results from integrating both sides, 


\[ x = -z - a\ln(a - z), \]


but went on to give a geometric interpretation of the result. Helpful though that 
can be, the tradition was to lapse in the face of too many cases where it could not 
profitably be done. 
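The integrated form can be checked numerically. The sketch below is a modern illustration, not Bernoulli's procedure: it assumes the value a = 1 and fixes the constant of integration so that x = 0 when z = 0 (both are choices made only for this example), then compares a finite-difference estimate of dx/dz with z/(a - z).

```python
import math

def debeaune_x(z, a):
    """x as a function of z = y - x, normalised so that x = 0 when z = 0."""
    return -z - a * math.log(a - z) + a * math.log(a)

def check_point(z, a, h=1e-6):
    """Return (central-difference estimate of dx/dz, exact value z / (a - z))."""
    numeric = (debeaune_x(z + h, a) - debeaune_x(z - h, a)) / (2 * h)
    exact = z / (a - z)
    return numeric, exact

a = 1.0  # assumed value of the given segment E
for z in (0.1, 0.5, 0.9):
    numeric, exact = check_point(z, a)
    print(f"z = {z}: dx/dz ~ {numeric:.6f}, z/(a - z) = {exact:.6f}")
```

As z approaches a the value of x grows without bound while y - x stays below a, which is one way to see why the curve has an asymptote parallel to AC.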

However, if we look at some ideas that Bernoulli wrote down only a few years
later, in 1702, we can see the growth of formalism.¹¹ By now, he was quite clear that
expressions such as $\frac{dx}{x+f}$, where f is a constant, are differentials of logarithms,
and therefore that

\[ \int \frac{dx}{x + f} \]

is a logarithm. He now introduced a variety of changes of variables to deduce that certain
previously encountered integrals can be expressed in terms of either logarithms
or circular arcs. His list includes the integral

\[ \int \frac{dx}{\sqrt{x^2 + ax}}, \]

which arose in the catenary problem, and the logarithm lurking in Debeaune’s problem.
Bernoulli regarded his changes of variable as enabling him to pass from circular
arcs to arcs of hyperbolas and back, which made his geometric interpretations of analytic
formulas more flexible. Interestingly, they involved him in introducing complex
numbers, which was to occasion some confusion later on.


1.4 Differential Equations 


We turn now to study how Euler rewrote the calculus. 

The Leibnizian form of the calculus, which was the form adopted by the mathematicians
of continental Europe, was initially seen as a set of algorithms for handling
problems about curves. These algorithms work because they apply to formal
expressions involving variables, and the two basic operations, differentiation and
integration, $d$ and $\int$, obey rules such as

\[ d(uv) = u\,dv + v\,du \quad\text{and}\quad d\!\int u = u. \]

The connection between these formal operations and geometry arises from their
geometrical interpretations; for example, $d$ has to do with finding tangents, and $\int$
with areas.

¹¹ See Bernoulli [11] and F&G 13.B2.

Euler rewrote the calculus by regarding it as being about formal expressions and
replacing the concept of a curve with that of a function. Calculus is about expressions 
that can be differentiated and integrated. It is only about curves in so far as they can 
be described by formal expressions, which allow one to use differentiation to find 
tangents and so forth, and the solution to a differential equation is not to be expressed 
as a curve but as an explicit or implicit function of the coordinates. 

As is well known, Euler’s analysis of sine, cosine, and the exponential functions 
is unified by the function idea. He expressed them as power series and treated them
formally or algebraically; there is very little geometry. In particular, his controversial
solution to the problem of defining the logarithm of a negative number proceeded by
defining log as the inverse function to exp.¹²

Euler’s emphasis on the calculus, and indeed much of mathematics, as a science 
of formal expressions widely restructured mathematical theory. His treatment of 
Debeaune’s problem provides another example. Euler went from the differential
equation

\[ dx = \frac{z\,dz}{a - z} \]

to an answer in this form:

\[ x + z + a\log(a - z) = \text{constant}. \]

He saw this equation as the answer, and saw no need for a geometrical interpretation.
Indeed, in his definitive account of the integral calculus (published between 1768
and 1770) he did not even mention Debeaune’s equation by name when he gave a
complete account of how to solve all differential equations of the form

\[ (\alpha + \beta x + \gamma z)\,dx = (\delta + \varepsilon x + \mu z)\,dz \]

(this equation reduces to Debeaune’s on setting $\alpha = a$, $\beta = 0 = \delta = \varepsilon$, $\gamma = -1$, $\mu = 1$).

We shall see that Euler’s mathematics is full of investigations of objects defined 
by differential equations or integrals. Problems are expressed as equations (finite 
or differential) and solved by finding power series expansions or other algebraic 
reformulations. If one sees mathematics as having three aspects (problems, methods,
and results), then one might say that Euler very often saw problems algebraically and
solved them algebraically, expressing his results either in finite terms or as infinite
series.¹³

¹² See Euler [78].


1.5 Linear Ordinary Differential Equations 


It had been known from the time of Newton and Leibniz that the calculus provides 
good answers and not-so-good answers. A three-way correspondence between Euler, 
Johann Bernoulli, and Johann’s son Daniel provides a good illustration of how this 
difficulty was addressed in the context of finding the shapes of a vibrating, clamped 
rod (setting aside the question of how the shape varies in time). 

Daniel raised this problem in a letter to Euler of 18 December 1734. He wrote to Euler
again on 4 May 1735
to say that he had found a differential equation that describes its shape, but that the
only solutions he could find to the equation, which involved sines and exponentials,
seemed inappropriate. Euler replied with a solution in the form of a power series,
which he wrote up and presented as a paper [71] (E40) to the St. Petersburg Academy
of Sciences. This is not a good answer. As so often, the power series is unilluminating,
and it concealed from Euler the fact that the rod can vibrate in several distinct ways.

Then, in 1739, Euler spotted a much better approach and found a much better
answer, which he described in a letter that he wrote to Johann Bernoulli on 15
September,¹⁴ and more fully in a paper (E62) published in 1743.

In this letter, he proposed a simple, general method for all differential equations 
of a form he described, which we could call linear ordinary differential equations of 
arbitrary order and with constant coefficients. His method reduces these problems 
to the solution of a polynomial equation and establishes that the answer to the dif- 
ferential equation is always given as a sum of exponentials, sines and cosines. He 
wrote: 


I have recently found a remarkable way of integrating differential equations of higher degrees
in one step, as soon as a finite [algebraic] equation has been obtained. Moreover this method
extends to all equations which, on setting dx constant, are contained in this general form:

\[ y + \frac{a\,dy}{dx} + \frac{b\,ddy}{dx^2} + \frac{c\,d^3y}{dx^3} + \frac{d\,d^4y}{dx^4} + \frac{e\,d^5y}{dx^5} + \text{etc.} = 0. \]

To find the integral of this equation I consider this equation or algebraic expression:

\[ 1 - ap + bp^2 - cp^3 + dp^4 - ep^5 + \text{etc.} = 0. \]


If possible this expression is resolved into simple real factors of the form $1 - \alpha p$: if, however,
this cannot be done resolve it into factors of two dimensions of this form $1 - \alpha p + \beta pp$,
which resolution can always be done in reals, for whatever form the equation may have
it can always be put in the form of a product of factors either simple, $1 - \alpha p$, or of two
dimensions $1 - \alpha p + \beta pp$, all real. This resolution being done, I say that the value of y is a
finite expression in x and constants, obtained from all the members which have been factors
of the algebraic expressions, and singular members supply singular terms of the integral.
Certainly the simple factor $1 - \alpha p$ gives as member of the integral $Ce^{-x/\alpha}$, and a composite
factor $1 - \alpha p + \beta pp$ gives this member of the integral

\[ e^{-\alpha x/2\beta}\left(C \sin A.\frac{x\sqrt{4\beta - \alpha\alpha}}{2\beta} + D \cos A.\frac{x\sqrt{4\beta - \alpha\alpha}}{2\beta}\right) \]

where for me sin A. and cos A. denote the sine and the cosine of arcs in a circle of radius
= 1: however it is to be noticed that if the expression $1 - \alpha p + \beta pp$ cannot be resolved
into simple real factors, when $4\beta > \alpha\alpha$, still the integrals are real.

¹³ One person’s method can be somebody else’s problem, and so forth, but the trichotomy is no less
useful even so.

¹⁴ See Eneström [66], 33-38, Cannon and Dostrovsky [28], F&G 14.A1(a), and Euler [99], 00213.


Let the following be taken as a suitable example

\[ y\,dx^4 = K^4 d^4y, \quad\text{or}\quad y - \frac{K^4 d^4y}{dx^4} = 0; \]

this gives rise to the algebraic expression $1 - K^4p^4$, whose real factors are these three
$1 - Kp$, $1 + Kp$, $1 + K^2p^2$; and from these spring the integrals of the equation

\[ y = Ce^{-x/K} + De^{x/K} + E\sin A.\frac{x}{K} + F\cos A.\frac{x}{K}; \]
K K 


in which expression, because a four-fold integration has been done in one operation, there 
are four new constants as the nature of the integration demands. If it would please you, most 
excellent sir, I shall write down the method of proof on another occasion. 


It is not clear how Euler came upon his brilliant idea. It falls out, however, as
soon as one tries to see if the differential equation is solved by functions of the form
$y = e^{-px}$. Because

\[ \frac{dy}{dx} = -py, \quad \frac{d^2y}{dx^2} = p^2 y, \quad\text{and so on,} \]

when $y = e^{-px}$ is substituted into the equation, the resulting equation is

\[ e^{-px}(1 - ap + bp^2 - cp^3 + dp^4 - ep^5 + \text{etc.}) = 0. \]

The expression $e^{-px}$ is never zero, so it can be divided out, and therefore, as Euler
claimed, $y = e^{-px}$ is a solution of the differential equation if p is a solution of the
polynomial equation.
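The substitution argument can be tried out on Euler's own example, $y - K^4\,d^4y/dx^4 = 0$. The following sketch is a modern numerical illustration, not anything Euler did: the value K = 2, the sample point x0 = 0.7, and the helper names char_poly and fourth_derivative are all assumptions made for this example. It checks that the four values of p make the algebraic expression vanish, and that the four corresponding functions satisfy the differential equation up to finite-difference error.

```python
import math

def char_poly(coeffs, p):
    """Evaluate 1 - a p + b p^2 - c p^3 + ... for coeffs = [1, a, b, c, ...].

    Substituting y = exp(-p x) into y + a y' + b y'' + ... = 0 multiplies the
    k-th derivative term by (-p)**k, which produces the alternating signs.
    """
    return sum(c * (-p) ** k for k, c in enumerate(coeffs))

def fourth_derivative(f, x, h=1e-2):
    """Central-difference estimate of the fourth derivative of f at x."""
    return (f(x - 2 * h) - 4 * f(x - h) + 6 * f(x)
            - 4 * f(x + h) + f(x + 2 * h)) / h ** 4

K = 2.0                                        # assumed sample value
coeffs = [1.0, 0.0, 0.0, 0.0, -K ** 4]         # y - K^4 y'''' = 0
roots = [1 / K, -1 / K, 1j / K, -1j / K]       # roots of 1 - K^4 p^4 = 0
root_residuals = [abs(char_poly(coeffs, p)) for p in roots]

modes = {
    "exp(-x/K)": lambda x: math.exp(-x / K),
    "exp(x/K)": lambda x: math.exp(x / K),
    "sin(x/K)": lambda x: math.sin(x / K),
    "cos(x/K)": lambda x: math.cos(x / K),
}
x0 = 0.7                                       # assumed sample point
mode_residuals = {name: abs(f(x0) - K ** 4 * fourth_derivative(f, x0))
                  for name, f in modes.items()}

print(root_residuals)
print(mode_residuals)
```

The two imaginary roots are what produce the sine and cosine terms, in line with the passage from complex exponentials to circular functions described in the text.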

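To find the values of p, Euler claimed that the polynomial equation can always
be factored into linear terms of the form $1 - \alpha p$ and quadratic terms of the form
$1 - \alpha p + \beta pp$. This is an example of what came to be called the fundamental theorem
of algebra, which was widely believed at the time, but not proved. Euler then solved
these equations for p and found

\[ p = \frac{1}{\alpha} \quad\text{and}\quad p = \frac{\alpha \pm \sqrt{\alpha^2 - 4\beta}}{2\beta}, \]

respectively. Notice that the second expression is equal to

\[ \frac{\alpha \pm i\sqrt{4\beta - \alpha^2}}{2\beta}. \]

The first case leads to the solution $y = e^{-x/\alpha}$. The second case leads to the solution

\[ y = \exp\left(-\frac{\alpha x}{2\beta}\right)\exp\left(\mp\frac{i x\sqrt{4\beta - \alpha^2}}{2\beta}\right). \]

Now he already knew that

\[ e^{p + iq} = e^{p}(\cos q + i\sin q), \]

so he saw that

\[ \exp\left(-\frac{(\alpha \pm i\sqrt{4\beta - \alpha^2})\,x}{2\beta}\right) = \exp\left(-\frac{\alpha x}{2\beta}\right)\left(\cos\frac{x\sqrt{4\beta - \alpha^2}}{2\beta} \mp i\sin\frac{x\sqrt{4\beta - \alpha^2}}{2\beta}\right). \]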


The term involving the sine function remains a solution when multiplied by any 
constant, and so the factor of i can be removed by multiplying by i, and Euler’s 
solutions are finally obtained. 

As we remarked, Daniel Bernoulli had already noticed that exponentials, sines, 
and cosines were among the solutions, but Euler was the first to see that every solution 
could be written in terms of them. As a result, his new solution to the differential 
equation for the vibrating rod is much better than his earlier power series solution, 
because it becomes possible to see what some of the solutions actually look like, and 
this is very instructive. For example, each of the next four functions is separately
a solution: $y = e^{-x/K}$, $y = e^{x/K}$, $y = \sin(x/K)$, and $y = \cos(x/K)$, which can be
called the basic modes of vibration of the rod. Moreover, the new approach brings to
light that the shape of the rod at any instant is a certain sum of these basic modes 
with constant coefficients. 

Furthermore, because the rod is fastened to the wall and protrudes, say, horizontally,
and the mortar is secure and immovable, any solution is subject to the two
initial conditions that when x = 0 necessarily y = 0 and dy/dx = 0. This eliminates
some combinations of the basic modes, and the allowed solutions (at any moment of
time) are all of the form

\[ \alpha e^{Kx} + \beta e^{-Kx} - (\alpha + \beta)\cos Kx - (\alpha - \beta)\sin Kx. \]
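The two clamping conditions can be checked directly against this family of solutions. The lines below are only a sketch of that check in modern notation; the coefficients are written as $\alpha$ and $\beta$, which is an assumption about the symbols intended in the displayed formula.

```latex
\begin{align*}
y(x) &= \alpha e^{Kx} + \beta e^{-Kx} - (\alpha + \beta)\cos Kx - (\alpha - \beta)\sin Kx,\\
y(0) &= \alpha + \beta - (\alpha + \beta) = 0,\\
\frac{dy}{dx}\bigg|_{x=0} &= K\alpha - K\beta - K(\alpha - \beta) = 0.
\end{align*}
```

Two of the four arbitrary constants are used up by the clamping conditions, which leaves the two-parameter family shown.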


This also explained what Daniel Bernoulli had noted experimentally: a thin rod 
clamped to a wall can be made to emit several different sounds as it is plucked, and 
indeed, several different sounds at once because it can be in several distinct shapes.¹⁵

Euler informed his 72-year-old former professor Johann Bernoulli of his claims 
about this class of differential equations but did not send him a proof. Not to be 
outdone, Bernoulli replied with a proof in early December 1739, but his approach is 
interestingly old-fashioned. He first showed how to reduce the differential equation 
to a polynomial equation, much as Euler had done. However, Bernoulli adhered to 
the geometric language that Euler’s work would gradually drive out, and interpreted 
the solutions, which he wrote in the form $y = n^{x/p}$, as “logarithmic curves whose
subtangent is to be found”.¹⁶ In addition, Bernoulli regarded the equation $p^4 - K^4 = 0$
as the same as p = K and remarked that whereas he had one solution, Euler had
exhibited several. He commented that for this to be the case “my logarithms will
be impossible or imaginary, but it is also the same in your solution, allowed to be
more general, for you must let K be impossible or non-real”. This shows that complex
numbers were puzzling when they occurred in problems involving real quantities, but
were nonetheless accepted, perhaps as something that needed to be better understood.

¹⁵ Bernoulli used a needle.


Euler’s attitude to what he would consider an answer to a differential equation 
makes one crucial advance over Johann Bernoulli’s. The methods they used to solve 
differential equations were not so very different: changes of variable, cunning sub- 
stitutions, and so on, although certainly Euler’s insight that reduces these differential 
equations to polynomial equations was a breakthrough. But changing what was con- 
sidered an acceptable answer was an essential ingredient in advancing the calculus— 
and Euler seems to have had some success in convincing Johann Bernoulli, too, of 
the value of thinking in this way. 


The first two volumes of Euler’s Institutionum Calculi Integralis (E342 and E366)
give a good indication of what he could do by the late 1760s (the book was
presented to the St. Petersburg Academy in August 1766; volume 1 was published in
1768, volume 2 in 1769). Euler investigated a number of different kinds of ordinary
differential equations, looking for simplifications and for general methods. Quite an 
amount of insight into complete solutions, particular integrals, and initial conditions 
is accumulated. In volume 2 Chap. 8 Euler considered the second-order ordinary 
differential equation, and began by solving the linear equation by the method of 
undetermined coefficients. (A particular type of this equation is the hypergeometric 
equation, which we shall investigate in Chap. 11). 

¹⁶ The subtangent to a curve at a point P is the distance from the point where the tangent meets the
x-axis to the point on the x-axis vertically above or below P. So, if P has coordinates (x, y) and
dy/dx = p at P, then the subtangent is y/p.

Given the equation

\[ \frac{d^2y}{dx^2} + M\frac{dy}{dx} + Ny = X, \]

in which M, N, and X are functions of x, Euler began, as usual, by taking particular
cases. His first example was

\[ x^2(a + bx^n)\frac{d^2y}{dx^2} + x(c + dx^n)\frac{dy}{dx} + (f + ex^n)y = 0. \]

He looked for a solution of the form

\[ x^{\lambda}(A + Bx^{n} + Cx^{2n} + \cdots), \]

and by looking at the lowest power of x was led to the equation

\[ \lambda(\lambda - 1)a + \lambda c + f = 0. \]

Once this is solved, the constants B, C, ... can all be found recursively in terms of
A.
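The recursion can be made explicit. The sketch below is a modern reading of the method of undetermined coefficients, not Euler's own layout: it assumes that matching the coefficient of $x^{\lambda + kn}$ in the equation $x^2(a + bx^n)y'' + x(c + dx^n)y' + (f + ex^n)y = 0$ links each coefficient to the previous one, and the function names are hypothetical. As a test case it uses a = c = e = 1, b = d = f = 0, n = 2, which gives the equation $x^2y'' + xy' + x^2y = 0$ with $\lambda = 0$.

```python
from fractions import Fraction

def indicial(lam, a, c, f):
    """Left-hand side of the equation lambda(lambda - 1)a + lambda c + f = 0."""
    return lam * (lam - 1) * a + lam * c + f

def series_coefficients(lam, n, *, a, b, c, d, e, f, terms, first=Fraction(1)):
    """Coefficients A_0, A_1, ... of x^lam (A_0 + A_1 x^n + A_2 x^(2n) + ...).

    Matching the coefficient of x^(lam + k n) gives
        [r(r - 1)a + r c + f] A_k + [s(s - 1)b + s d + e] A_(k-1) = 0,
    with r = lam + k n and s = lam + (k - 1) n.
    The bracket multiplying A_k must not vanish for k >= 1; when it does, the
    method yields only one solution (the case needing a logarithmic term).
    """
    coeffs = [Fraction(first)]
    for k in range(1, terms):
        s = lam + (k - 1) * n
        r = lam + k * n
        upper = s * (s - 1) * b + s * d + e      # multiplies A_(k-1)
        lower = r * (r - 1) * a + r * c + f      # multiplies A_k
        coeffs.append(-upper * coeffs[-1] / lower)
    return coeffs

# Test case: x^2 y'' + x y' + x^2 y = 0, for which lambda = 0 is a double root.
example = series_coefficients(0, 2, a=1, b=0, c=1, d=0, e=1, f=0, terms=4)
print(indicial(0, 1, 1, 0), example)
```

Exact rational arithmetic is used so that the pattern of the coefficients is visible rather than hidden in floating-point digits.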

Euler also looked for solutions in which the powers of x decrease, at the cases
of both real and imaginary values of $\lambda$, and the more difficult case when the values
of $\lambda$ are either the same or differ by an integer and the method provides only one
solution, not two. In this case, Euler found another solution with a logarithmic term.

It seems that linear differential equations of higher order were beyond Euler’s
reach not because of the method of series but because of problems with the correspondingly
higher order equation for $\lambda$. There were, of course, no explicit methods
for solving the quintic equation, nor was there a secure proof of the so-called
fundamental theorem of algebra. So although Volume 2, Sect. 2, Chap. 2 of Euler’s
book covers the third-order linear equation with constant coefficients, the solutions
of which are of the form $e^{\lambda x}$ for the values of $\lambda$ that satisfy the corresponding cubic
equation, and Euler dealt with both the case of distinct roots and the case of repeated
roots, the extension of the analysis to higher order equations became lost in the
details.


It had not been a hundred years since Leibniz had struggled to master one inverse 
tangent problem, but by the 1760s, Euler had a theory of many different kinds of 
differential equations. It embraced differentials of various degrees; homogeneous 
equations; solution methods that introduced multipliers or employed the method of 
infinite series or relied on a method of successive approximation. The formal side 
of the calculus was evidently being deployed, so much so that examples appear for 
the first time only on page 355—surely a good sign that we are in the presence of a 
theory rich enough to keep mere examples at bay. 


1.5.1 A Note on the Adjoint Equation 


I mention here that Euler’s work was extended by the young Joseph-Louis Lagrange
in a 200-page memoir [171] that he published in Miscellanea Taurinensia. Lagrange
established that a linear equation of order n of the form

\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} + \cdots = T, \]

where L, M, N, ..., T are functions of t, will have n solutions (independence is
implied but not stated) as the method of undetermined coefficients would suggest,
and proceeded to investigate interesting cases and methods for reducing the order
of the equation. This led him to discover that one can associate to a given ordinary
differential equation another that, if solved, enables one to reduce the order of
the original ordinary differential equation by one. Repeating this trick on the new
equation returns almost to the original one; it actually comes back as

\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} + \cdots = 0. \]

In the nineteenth century, the second equation came to be called the adjoint equation
of the first equation, and so one could say that Lagrange had proved that the adjoint
of an equation is (the homogeneous form of) the original equation.

It will be enough to demonstrate the trick on the second-order equation

\[ Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} = T. \]

Lagrange multiplied both sides by an unknown function z = z(t) and integrated, to
get

\[ \int Lyz\,dt + \int M\frac{dy}{dt}z\,dt + \int N\frac{d^2y}{dt^2}z\,dt = \int Tz\,dt. \]

He integrated the terms involving M and N by parts, so the equation becomes

\[ \int Lyz\,dt + Myz - \int \frac{d(Mz)}{dt}y\,dt + Ny'z - \int \frac{d(Nz)}{dt}y'\,dt = \int Tz\,dt, \]

and, integrating by parts again,

\[ \int Lyz\,dt + Myz - \int \frac{d(Mz)}{dt}y\,dt + Ny'z - y\frac{d(Nz)}{dt} + \int \frac{d^2(Nz)}{dt^2}y\,dt = \int Tz\,dt, \]

which he rearranged with respect to y, when it becomes

\[ y\left(Mz - \frac{d(Nz)}{dt}\right) + \frac{dy}{dt}Nz + \int\left(Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2}\right)y\,dt = \int Tz\,dt. \]

So, if z is chosen to be a solution of the equation

\[ Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2} = 0, \]

then the original differential equation reduces to

\[ y\left(Mz - \frac{d(Nz)}{dt}\right) + \frac{dy}{dt}Nz = \int Tz\,dt, \]

which is of order one less than before. As Lagrange pointed out, the adjoint equation
is simpler than the original one because it is homogeneous, and if it can be solved the
original equation is also simpler because it has been reduced to one of lower order.
Lagrange then turned to other questions: the motion of fluids, the vibrating string,
the motion of the planets. All in all, it is a formidable paper, familiar with the work
of d’Alembert on fluids and Euler on the vibrating string, although Lagrange sided
more with d’Alembert than Euler over the generality of the solutions of the wave
equation.
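A small worked case may make Lagrange's trick concrete. The example is not from Lagrange's memoir: it is a sketch that assumes constant coefficients L = 2, M = 3, N = 1, chosen only because the adjoint equation can then be solved by inspection.

```latex
\begin{align*}
&\text{Original equation:} && 2y + 3\frac{dy}{dt} + \frac{d^2y}{dt^2} = T.\\
&\text{Adjoint equation:} && 2z - 3\frac{dz}{dt} + \frac{d^2z}{dt^2} = 0,
  \quad\text{solved by } z = e^{t} \text{ since } 2 - 3 + 1 = 0.\\
&\text{Reduced equation:} && y\,(3e^{t} - e^{t}) + \frac{dy}{dt}\,e^{t} = \int T e^{t}\,dt,
  \quad\text{that is,}\quad \frac{dy}{dt} + 2y = e^{-t}\int T e^{t}\,dt.\\
&\text{Check:} && \frac{d}{dt}\left[e^{t}\left(\frac{dy}{dt} + 2y\right)\right]
  = e^{t}\left(\frac{d^2y}{dt^2} + 3\frac{dy}{dt} + 2y\right) = e^{t}\,T.
\end{align*}
```

The second-order equation has been traded for a first-order one, at the cost of one quadrature on the right-hand side.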


1.6 Exercises 


1. Solve Debeaune’s problem and show that the solution with the stated initial con- 
ditions has the line x + y = 0 as an asymptote. 
2. Solve Euler’s equation

\[ x^2(a + bx^n)\frac{d^2y}{dx^2} + x(c + dx^n)\frac{dy}{dx} + (f + ex^n)y = 0. \]


What qualitative features of the solution are apparent to you (if any)? 


Questions 


1. What qualitative features of a function are apparent to you from its power series 
representation? Consider, for example, the series for exp, cos, and sin. Is it at all 
obvious that the series for cos and sin define periodic functions? 

2. How would you attempt to graph the equation for the tractrix if you did not 
know of its origins as a problem in physics? In the light of your experience, how 
well do you think a mathematician of the late seventeenth century could claim to 
understand a curve knowing only its equation? 


Chapter 2
Variational Problems and the Calculus


2.1 Introduction 


Inspired by calculus, which made problems look simple that not long before no one 
had dared to raise, mathematicians began to ask a variety of questions about curves. 
We met some in the previous chapter that led to inverse tangent problems, but others 
were to lead to a new branch of the calculus and ultimately to new principles for 
the study of mechanics. As such, if they were solved at all they were solved by 
ingenuity rather than a systematic method, but—as we shall see—the insights that 
were produced on the way were often deep and lasting.¹

Very likely the oldest, most attractive, and most famous problem of the kind we are 
about to discuss is known as Dido’s problem, which asks for the shape of the largest 
area bounded by a straight line and a curve of given length, or, in some versions, 
the greatest planar area enclosed by a curve of given length. According to various 
mythological sources, Dido fled the city of Tyre, perhaps around 825 BCE, and came 
to a place on the North African coast where she was granted permission to have as much
land as a strip made from oxhide could enclose. She cut the hide long and thin, and
enclosed an area upon which the city of Carthage was founded. But which problem
she solved, and what her solution was, mythology does not make precise.²

The problem was brought to people’s attention in one of Lord Kelvin’s popular
lectures in 1893, and the mathematician Adolf Kneser solved it in his textbook [160]
in this form: Find the curve of given length joining two points A and B that, together
with the chord AB, encloses the greatest area. The solution is a circular arc with
the line as a chord, as we shall see in Sect. 26.3. But he returned to it in a paper
[161], where he pointed out that the problem is far from being the simplest in the
calculus of variations. It is entirely possible that Dido could have taken the problem
to mean: Find the curve of given length joining two points on the coast that encloses
the greatest area, and this allows for the end points to be variable. In this case, too,
the solution curve must be a circular arc, but it is not clear that the maximum area is
necessarily attained.³

¹ This chapter follows Freguglia and Giaquinta [109], to which readers are referred for much
fascinating information.

² In Virgil’s Aeneid, Dido then falls in love with Aeneas, who was seeking a new home after the
destruction of his native Troy, but when he abandons her she commits suicide, calling down endless
hate upon him. This was the origin of the Punic war centuries later between her city, Carthage, and
Rome, the city founded by descendants of Aeneas. Virgil’s poem is most likely incompatible with
what we know of the Trojan wars.


2.2 Bernoulli’s Problems 


In June 1696, Johann Bernoulli added a challenge to the mathematical community 
to a paper of his in the journal the Acta Eruditorum: 


Given points A and B in a vertical plane, to find the path AMB down which a movable point
M must, by virtue of its weight, proceed from A to B in the shortest possible time.


This problem, the problem of quickest descent, or the brachistochrone problem, to use
the name that Bernoulli coined from the Greek for the curve, is emblematic of the topic.
Bernoulli added that the solution was not the straight line joining A and B but was
in fact a curve well known to mathematicians, and promised to publish the solution 
if no one could find it. He also sent the problem in a letter to Leibniz on 9 June 
1696, who replied on 16 June with a solution and the suggestion that Bernoulli delay 
publishing the solution because the journal travelled only slowly across Europe. 

In May 1697, Bernoulli published his solution, and one by his brother Jakob, 
as well as discussions of the problem by Tschirnhaus and de l’Hôpital. Leibniz
withdrew his solution, saying that it was too close to the ones proposed by the 
Bernoulli brothers. 

Before we look at Johann Bernoulli’s solution, we should note that the problem 
was not original with him. It had been proposed before, by Galileo, in his Dialogues 
Concerning Two New Sciences, where he gave a fallacious argument that purported 
to show that the solution was an arc of a circle.⁴

This mistake comes at the end of the long dialogue on the third day of the Two New 
Sciences, where Galileo recorded his lasting contribution to the study of motion: his 
laws of falling bodies.⁵ He had studied the times taken by balls to roll down inclined
planes of various slopes, thus slowing their rate of descent to lengths of time that 
could be measured by accurate, regular counting. He was led to proclaim that a 
uniformly accelerated body will fall as far in an interval of time as one moving with 
a constant velocity that is the average of the initial and final speeds of the first body. 
Furthermore, the total distance covered by the accelerating body increases as the
square of the elapsed time.


3 We look at a solution to this problem in Sect. 26.3.1 below.
4 See Galileo [113] Theorem XXII, Proposition XXXVI, Scholium.
5 For an extract, see F&G 10.B4.




Fig. 2.1 Galileo’s law of falling bodies


Then he noted (see Fig.2.1) that “If a body falls freely along smooth planes 
inclined at any angle whatsoever, but of the same height, the speeds with which it 
reaches the bottom are the same”. Crucially for present purposes, he immediately
remarked (Theorem III, Proposition I) 


If one and the same body, starting from rest, falls along an inclined plane and also along a 
vertical, each having the same height, the times of descent will be to each other as the lengths 
of the inclined plane to the vertical. 


We would see this by resolving the velocity along the slope into its horizontal and
vertical components. One ball falls from O to $P_1$, a distance of $l_1$, which it reaches
with velocity $v_1$. Another ball rolls from O to P, a distance of $l$, which it reaches
with velocity $v$. We have, by conservation of energy,

\[ v_1 = v, \qquad l_1 = l\cos\alpha, \]

so

\[ \frac{v_1}{l_1} = \frac{v}{l\cos\alpha}. \]

We also know that, because the acceleration is uniform, the time to fall from rest
through a distance h is h divided by half the velocity at h, so

\[ t_1 = l_1\,\frac{2}{v_1} \quad\text{and}\quad t = l\,\frac{2}{v}, \]

so

\[ \frac{t}{t_1} = \frac{l}{l_1}, \]

as Galileo claimed.
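
Galileo’s proposition is easy to check numerically. The following Python sketch is ours and makes several assumptions not in the text: the body slides without friction rather than rolls, the gravitational acceleration is taken as 9.81, and the helper name descent_time is hypothetical. It compares the time of descent along inclines of a common height with the time of vertical fall.

```python
import math

def descent_time(length, height, g=9.81):
    """Time to slide from rest down a frictionless straight incline.

    `length` is the distance along the incline and `height` its vertical
    drop; the names and the value of g are illustrative assumptions.
    """
    # The component of gravity along the slope is g * height / length.
    accel = g * height / length
    # From length = (1/2) * accel * t**2.
    return math.sqrt(2.0 * length / accel)

height = 1.0
t_vertical = descent_time(height, height)  # vertical fall: length == height
for length in (1.5, 2.0, 5.0):
    t_incline = descent_time(length, height)
    # Galileo: the times of descent are in the ratio of the lengths.
    print(length / height, t_incline / t_vertical)
```

Each printed pair should agree up to rounding, which is the content of the proposition.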

Galileo’s argument was not that different: he measured the magnitude of the force 
acting on the ball on the slope by the weight of a ball hanging vertically down from 
the top of the slope that was attached to the first ball by a cord and such that the two 
balls did not move. 

The quantification of velocity and acceleration by Galileo was to be exactly what 
was needed to study motion with the advent of the calculus. 
Fig. 2.2 Snell’s law of refraction


Equal significance is attached to Fermat’s argument that the path of light in a 
varying medium is the one that takes the least time. In the late 1630s, he had already 
been in an argument with Descartes about the refraction of light and the explanation 
of Willebrord Snell’s law for refraction. Snell’s law states that when a ray of light
leaves one medium and enters another there is a constant ratio r, determined by the
two media, such that

\[ \frac{\sin\theta_1}{\sin\theta_2} = r, \]

where $\theta_1$ and $\theta_2$ are the angles the light makes with the normal at the point of crossing
in the two media (see Fig. 2.2).

In 1657, Fermat learned of the Greek mathematician Heron’s ideas about why 
light travels in straight lines (in a constant medium) and saw that he could adapt 
them to explain refraction. He supposed that light travels at one speed in air, say $v_1$,
and another, slower, speed in water, say $v_2$, and then wrote down the time of travel
from a point in the water to a point in the air on the assumption that it travelled
along straight lines in each medium. His ad hoc techniques for finding the minima
of certain quantities were up to the task, and he deduced that in the present set-up,

\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}. \]


We can argue the same conclusion slightly more rigorously. We choose a point
$A_1$ that is $a_1$ units below the surface of the water, and a point $A_2$ that is $a_2$ units
above it, and we suppose they are d units apart horizontally. We suppose that the
light travels along a straight line from $A_1$ to a point P on the water surface and then
along a straight line to $A_2$. With angles with the normal at P as given, we have for
the horizontal distance (Fig. 2.3)

\[ a_1\tan\theta_1 + a_2\tan\theta_2 = d. \]

Fig. 2.3 Fermat’s deduction

The distance travelled in the water is $a_1\sec\theta_1$ and in the air is $a_2\sec\theta_2$, and so the
total time taken is

\[ T = \frac{a_1\sec\theta_1}{v_1} + \frac{a_2\sec\theta_2}{v_2}, \]

where $v_1$ and $v_2$ here denote the speeds of light in the water and in the air respectively.
From the first equation, we find by differentiating that

\[ a_1\sec^2\theta_1 + a_2\sec^2\theta_2\,\frac{d\theta_2}{d\theta_1} = 0. \]

From the second equation, we deduce that

\[ \frac{dT}{d\theta_1} = \frac{a_1\sin\theta_1\sec^2\theta_1}{v_1} + \frac{a_2\sin\theta_2\sec^2\theta_2}{v_2}\,\frac{d\theta_2}{d\theta_1}. \]

For the shortest time, we require that

\[ \frac{dT}{d\theta_1} = 0. \]

Eliminating $\frac{d\theta_2}{d\theta_1}$ from these equations, we find that

\[ 0 = \frac{a_1\sin\theta_1\sec^2\theta_1}{v_1} + \frac{a_2\sin\theta_2\sec^2\theta_2}{v_2}\left(-\frac{a_1\sec^2\theta_1}{a_2\sec^2\theta_2}\right), \]

which simplifies to

\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}. \]


The constant in Snell’s law is revealed to be the ratio of the velocities of light in the 
two media. 
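
The least-time argument can also be tested by brute force. The sketch below is an illustration under our own assumptions, not Fermat’s method: the depths, separation and speeds are arbitrary sample values, the function names are hypothetical, and the minimum is found by a golden-section search over the crossing point rather than by differentiation. It then compares the ratio of the sines with the ratio of the speeds.

```python
import math

def travel_time(x, a1, a2, d, v1, v2):
    """Time from A1 (a1 below the surface) to A2 (a2 above it) via the
    surface point at horizontal distance x from A1; d is the total
    horizontal separation, v1 and v2 the speeds on each leg."""
    return math.hypot(a1, x) / v1 + math.hypot(a2, d - x) / v2

def fastest_crossing(a1, a2, d, v1, v2, tol=1e-10):
    """Golden-section search for the crossing point of least time."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, d
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if travel_time(x1, a1, a2, d, v1, v2) < travel_time(x2, a1, a2, d, v1, v2):
            hi = x2   # the minimum lies in [lo, x2]
        else:
            lo = x1   # the minimum lies in [x1, hi]
    return 0.5 * (lo + hi)

# Sample values, chosen only for illustration.
a1, a2, d = 1.0, 2.0, 3.0
v1, v2 = 0.75, 1.0
x = fastest_crossing(a1, a2, d, v1, v2)
sin1 = x / math.hypot(a1, x)            # sine of the angle with the normal on the first leg
sin2 = (d - x) / math.hypot(a2, d - x)  # and on the second leg
print(sin1 / sin2, v1 / v2)
```

The two printed numbers should agree to within the accuracy of the search.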

So Fermat’s principle that light takes the least time to travel between two points led 
him to a theoretical derivation of Snell’s law—but on what basis? Can this principle 
really be fundamental, or is it not the case that light behaves as it does for some (possibly
unknown) reason and that this reason implies the principle? Can a principle such as
least time act as a cause? These questions were not to be answered for a long time,
and Bernoulli himself is our source for the information that⁶


Leibniz in the Acta Eruditorum, 1682, pp. 285 et seq., and soon after the famous Huygens, 
in his Treatise on Light, p. 40, have demonstrated this more comprehensively and by most 
valid arguments, have established the physical, or better the metaphysical, principle which 
Fermat seems to have abandoned. 


2.3 The Bernoullis’ Brachistochrones


Johann Bernoulli tackled the brachistochrone problem by what was to become a 
standard method in mechanical questions: replace the problem by a number of dis- 
crete problems and let these problems crowd together and their number increase 
indefinitely until they tend to the original problem of interest.⁷

In this case, Bernoulli considered the path of light through a sequence of horizontal 
layers of translucent material, each layer having a different density. As he put it, 
they are made of “a diaphanous matter of a certain density decreasing or increasing 
according to a certain law”. At each boundary, Snell’s law applies and so the path of 
light through these media can be determined. 

How he had this idea is not known, but it is clear enough that by adjusting the 
density of the diaphanous layers a wide variety of paths can be obtained, just as a 
varying law of acceleration can. So “In this way we can solve the problem for an 
arbitrary law of acceleration, since it is reduced to the determination of the path of a 
light ray through a medium of arbitrarily varying density”. 

Then by looking at an infinitesimal moment, Bernoulli deduced that if the moving
particle goes from a point (x, y) to a point (x + dx, y + dy) and its velocity increases
from v to v + dv then, by Snell’s law,

\[ \frac{dx}{dz} = \frac{v}{a}, \]

for some constant a, where dz is the amount of motion along the tangent, so
$dx^2 + dy^2 = dz^2$. Since, by Galileo’s law, the speed acquired in falling through a
height y satisfies $v^2 \propto y$, which in suitable units we may write as $v = \sqrt{ay}$, this
gave him that

\[ dx = \sqrt{\frac{y}{a - y}}\,dy, \]

and so, he said, the brachistochrone is a cycloid.

6 See Freguglia and Giaquinta [109], 40.

7 See Bernoulli [9]. There is an English translation of this paper and Jakob’s in Struik Source Book
391-399.

Neither for him nor for us is this deduction entirely easy to make. We can write

\[ x = \int_0^y \sqrt{\frac{y}{a - y}}\,dy. \]

Set $y = a\sin^2 t$, so $dy = 2a\cos t\sin t\,dt$, and the integral becomes

\[ x = 2a\int_0^t \sin^2 t\,dt = -\frac{a}{2}\sin 2t + at = \frac{a}{2}(2t - \sin 2t). \]

We can write

\[ \sin^2 t = \frac{1}{2}(1 - \cos 2t) \]

and so express the solution curve parametrically in the form

\[ x = \frac{a}{2}(2t - \sin 2t), \qquad y = \frac{a}{2}(1 - \cos 2t). \]

This is the equation of a cycloid that starts at the origin. Moreover, this cycloid
has a vertical initial tangent, because when t = 0, $\frac{dx}{dt} = 0 = \frac{dy}{dt}$ but $\frac{d^2x}{dt^2} = 0$ and
$\frac{d^2y}{dt^2} \neq 0$.

The constant a can be determined from the separation of the end points, and once
that is given the cycloid is unique.
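
To see that the cycloid really does beat the straight line, one can compare descent times numerically. The following Python sketch is an illustration under our own assumptions: it uses the parametrisation x = (a/2)(2t - sin 2t), y = (a/2)(1 - cos 2t) with y measured downwards, takes g = 9.81 and a = 1, assumes frictionless sliding from rest, and uses hypothetical function names. The time along the cycloid is obtained by summing (chord length)/(speed) over many short pieces.

```python
import math

def cycloid_point(t, a):
    """Point on the cycloid for parameter t (y measured downwards)."""
    return 0.5 * a * (2.0 * t - math.sin(2.0 * t)), 0.5 * a * (1.0 - math.cos(2.0 * t))

def cycloid_time(a, t_end, g=9.81, steps=10000):
    """Approximate descent time from t = 0 to t = t_end along the cycloid."""
    dt = t_end / steps
    total = 0.0
    for i in range(steps):
        x0, y0 = cycloid_point(i * dt, a)
        x1, y1 = cycloid_point((i + 1) * dt, a)
        _, y_mid = cycloid_point((i + 0.5) * dt, a)
        speed = math.sqrt(2.0 * g * y_mid)   # energy conservation: v^2 = 2 g y
        total += math.hypot(x1 - x0, y1 - y0) / speed
    return total

def straight_line_time(x, y, g=9.81):
    """Descent time from rest along the straight line from (0, 0) to (x, y)."""
    length = math.hypot(x, y)
    return length * math.sqrt(2.0 / (g * y))

a = 1.0
t_end = math.pi / 2.0                 # the lowest point of the arch
x_end, y_end = cycloid_point(t_end, a)
print(cycloid_time(a, t_end), straight_line_time(x_end, y_end))
```

With these sample values the first number printed should be the smaller of the two.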


Jakob Bernoulli’s solution was different and seems to have influenced Euler a
generation later. He argued that if a path between points A and B is the path of
quickest descent then it must be the path of quickest descent between any two of
its points.⁸ For, if it was not the path of quickest descent between two intermediate
points C and D, say, then that path could be replaced with a quicker one and this
would shorten the time of descent from A to B as well, which was assumed to be a
minimum.

8 If the points A and B do not lie in a vertical line then the particle must start with some non-zero
velocity.

He then formulated this insight as an infinitesimal statement about a piece of the 
quickest path as compared to some nearby path, drawing on Galileo’s laws of motion. 
By an argument here suppressed he arrived at the same integral, and therefore the 
same solution, as his brother. 


2.4 Geodesics on Surfaces 


A geodesic on a surface is a curve of the shortest length that joins two points on the 
surface and lies entirely in the surface. In the plane, a geodesic is a straight line; on 
the sphere, it is an arc of a great circle (the circle cut out by the plane that passes 
through the given points and the centre of the sphere). 

In 1697, Johann Bernoulli challenged the mathematical community to investigate 
geodesics on curved surfaces. His insight into the problem had to do with the best 
approximating plane to a curve. He considered a geodesic on a surface and passing 
through a point P, and he looked at two points P’ and P” on the geodesic that tend 
to P. He argued that the plane through these three points on the geodesic then tends, 
as these points tend together to the point P, to the plane containing the tangent to 
the geodesic at P that is perpendicular to the surface at P. 

How might we come to believe that? We might argue that if the geodesic is 
traversed at a constant speed then its normal is perpendicular to the curve, the normal 
and the tangent between them define the plane in question, and for a geodesic, the 
normal is also perpendicular to the surface. Or, we might argue that the claim is true 
for a sphere, so it is true for the best approximating sphere to the surface at P and 
because these surfaces are arbitrarily close in the limit what is true for the sphere 
is true for the surface. Of course, we need another argument for surfaces that are 
saddle-shaped near P. 

Bernoulli’s argument was something like the first of ours, but in reverse. From the 
original geometric insight, he deduced that the curvature vector is proportional to the 
normal vector at every point on a geodesic. Thus, he could interpret the requirement 
that the curvature vector of a curve in a surface be normal to the surface as an equation 
for a geodesic. Even so, this did not lead to a solution to the problem except in special 
cases, and the problem lay fallow for 30 years until Bernoulli proposed it to the young 
Leonhard Euler in 1728. 
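
Bernoulli’s criterion can be tried out on a simple case. The sketch below is our own illustration, with assumed values for the radius and pitch and hypothetical function names: it takes a helix on a circular cylinder, traversed at constant speed, estimates its acceleration by central differences, and splits that acceleration into a part along the surface normal and two parts in the tangent plane.

```python
import math

RADIUS, PITCH = 1.0, 0.5   # illustrative cylinder radius and helix pitch

def helix(t):
    """A helix on the cylinder x^2 + y^2 = RADIUS^2, traversed at constant speed."""
    return (RADIUS * math.cos(t), RADIUS * math.sin(t), PITCH * t)

def acceleration(curve, t, h=1e-3):
    """Second derivative of the curve, estimated by central differences."""
    before, here, after = curve(t - h), curve(t), curve(t + h)
    return tuple((p - 2.0 * q + r) / h**2 for p, q, r in zip(before, here, after))

def split_acceleration(t):
    """Components of the acceleration: along the outward normal to the
    cylinder, and along two directions spanning the tangent plane."""
    ax, ay, az = acceleration(helix, t)
    normal = ax * math.cos(t) + ay * math.sin(t)    # outward radial direction
    around = -ax * math.sin(t) + ay * math.cos(t)   # tangent plane, around the axis
    along = az                                      # tangent plane, along the axis
    return normal, around, along

print(split_acceleration(1.0))
```

The two tangential components should vanish up to the error of the difference scheme, leaving only the normal component, as the criterion requires of a geodesic.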

It is worth noting that it follows from Johann Bernoulli’s characterisation of a
geodesic that force-free motion along a surface is along a geodesic if force-free is
taken to mean no forces acting along the surface (or, if you prefer, there are no forces
acting that have a non-zero component in the tangent plane).

Euler published a short paper on geodesics in 1732, in answer to a question from 
Johann Bernoulli. He considered a geodesic GMH on a surface, where the points are 
infinitely close together and M is the midpoint, and the plane through M parallel to 
the (y, z)-plane cuts the surface in the curve TMK. 
Euler took Cartesian coordinates in space (one of the earliest times this had been
done) and wrote down the distances GM and MH on the assumption that they were
well approximated by infinitesimal line segments. He said that the coordinates of the
points were

\[ G = (a, b, c), \quad M = (a + \alpha, y, z), \quad H = (a + 2\alpha, f, g), \]

so, by the three-dimensional Pythagorean theorem, distances between the points are

\[ GM = \sqrt{\alpha^2 + (y - b)^2 + (z - c)^2}, \quad MH = \sqrt{\alpha^2 + (f - y)^2 + (g - z)^2}. \]

For this to be a minimum as M varies the differential of GM + MH must vanish,
and this implies that

\[ \frac{(y - b)\,dy + (z - c)\,dz}{\sqrt{\alpha^2 + (y - b)^2 + (z - c)^2}} = \frac{(f - y)\,dy + (g - z)\,dz}{\sqrt{\alpha^2 + (f - y)^2 + (g - z)^2}}. \tag{2.1} \]

It remained for Euler to eliminate the arbitrary infinitesimals $\alpha$, y, z, f, and g.
After some work, here omitted, he obtained a second-order ordinary differential
equation that he was able to solve in simple cases, when the surface is a cylinder, a
conoid (a cone on a plane curve), or a surface of rotation.


2.5 Exercises 


1. Obtain a formula for the radius of curvature of a curve and show that for a curve 
traversed at unit speed the radius is the reciprocal of the magnitude of the accel- 
eration. 

2. Use Bernoulli’s insight in Sect. 2.4 to find the geodesics on a sphere, a cylinder, 
and a cone. 


Questions 


1. Information about tangents to a curve and its radii of curvature translates into
information about velocity and acceleration along a curve. Why do you think this 
often struck mathematicians in the eighteenth and early nineteenth centuries as 
enough? 


Chapter 3
The Vibrating String and the Partial
Differential Calculus


3.1 Introduction 


The study of problems involving more than one independent variable and the extension
of the calculus to deal with these problems were significant advances of the first
half of the eighteenth century. The first significant success was d’Alembert’s
mathematically correct description of the vibrating string, which has become famous
as being the first partial differential equation to be solved, and although that title
can be disputed on a technicality, this should not be allowed to mask the real
breakthrough his analysis achieved. Here we examine what he did to formulate and
solve a problem in two independent variables and show how it enabled many basic
phenomena of musical sounds to be explained.¹

In the case of two independent variables, d’Alembert and Euler found it natural
to introduce formal complex variables, but this raised questions they were unable to
answer about the implications of using them.²


3.2 Early Investigations into the Partial Differential 
Calculus 


The first person to extend the calculus systematically to two independent variables
was Nicolaus I Bernoulli in unpublished work in 1719. He had been led to it through
his work on a problem of contemporary interest, the determination of a family of
orthogonal trajectories to a given family of curves. Each curve in the orthogonal
trajectory is specified by a parameter $\alpha$; the coordinates of a point on a given curve
are functions of a parameter t.

1 This account overlaps with the accounts in Barrow-Green, Gray and Wilson [4] and Gray and
Micallef forthcoming.

2 I offer a blanket warning that the dates of publications in the eighteenth century can be confusing.
It was usual for a member of an Academy to present a paper, which would then be published in a
journal of the Academy, but the process was slow, and it is common for a paper to be published two or
three years after it was presented. Generally, I have referred to papers by their publication dates.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

We know from unpublished manuscripts that Bernoulli deduced the equality of
mixed partial derivatives from the following observation³: moving from an initial
point t on a curve with parameter $\alpha$ to the point t + dt and then along the orthogonal
trajectory to a point on the curve with parameter $\alpha + d\alpha$ is the same as first moving
to the curve with the parameter $\alpha + d\alpha$ and then changing t to t + dt. Euler argued
much the same way independently in his “De differentiatione” of 1730, which he
also left unpublished for several years. When he eventually did publish this result in
another paper [72], Nicolaus wrote to Euler in 1743 to say that he had not published
it himself because he regarded it as an axiom “which I thought to be obvious to
anybody from the mere notion of differentials”.⁴

The impetus to extend the calculus to functions of two variables had a second
source in the study of ordinary differential equations, specifically when mathematicians
were led to consider inexact differentials. These are expressions of the form
a(x, y)dx + b(x, y)dy that cannot be written as d(g(x, y)). Alexis Claude Clairaut,
in his paper [43], began by noting that if

\[ a(x, y)\,dx + b(x, y)\,dy = d(g(x, y)) \]

then necessarily, by the equality of mixed partial derivatives, $\frac{\partial a}{\partial y} = \frac{\partial b}{\partial x}$, and
conversely, or so he claimed, if this condition is met then the differential is exact.
This is only true if the functions a(x, y) and b(x, y) are defined everywhere in a
simply connected region, as the counter-example $a(x, y) = -y/(x^2 + y^2)$,
$b(x, y) = x/(x^2 + y^2)$ shows; this differential satisfies the condition away from the
origin but is not exact there. But Clairaut deduced the theory
from a consideration of monomials of the form $x^m y^n$ because he believed that every
function of two variables is expressible as an infinite sum of such monomials.⁵
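
The failure of the converse on a region with a hole can be seen numerically. The sketch below is our illustration, using the standard example a = -y/(x² + y²), b = x/(x² + y²) on the plane with the origin removed; the function names, sample point and step sizes are assumptions. It checks by central differences that the mixed-derivative condition holds at a sample point, and then sums a dx + b dy around the unit circle. For an exact differential that sum would be zero.

```python
import math

def a(x, y):
    return -y / (x * x + y * y)

def b(x, y):
    return x / (x * x + y * y)

def condition_gap(x, y, h=1e-5):
    """da/dy - db/dx, estimated by central differences."""
    da_dy = (a(x, y + h) - a(x, y - h)) / (2.0 * h)
    db_dx = (b(x + h, y) - b(x - h, y)) / (2.0 * h)
    return da_dy - db_dx

def circulation(radius=1.0, steps=10000):
    """Sum of a dx + b dy around a circle centred on the origin."""
    total = 0.0
    for i in range(steps):
        t0 = 2.0 * math.pi * i / steps
        t1 = 2.0 * math.pi * (i + 1) / steps
        tm = 0.5 * (t0 + t1)
        x, y = radius * math.cos(tm), radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += a(x, y) * dx + b(x, y) * dy
    return total

print(condition_gap(0.3, 0.7))          # close to zero: the condition holds
print(circulation(), 2.0 * math.pi)     # the integral round the origin is not zero
```

The circulation comes out close to 2π, so no single-valued function g can have this differential on the punctured plane.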

When the differential is not exact he proposed to look for a factor $\mu(x, y)$ such
that

\[ \mu(x, y)a(x, y)\,dx + \mu(x, y)b(x, y)\,dy \]

is exact, and this led him to the partial differential equation

\[ \frac{\partial(\mu a)}{\partial y} = \frac{\partial(\mu b)}{\partial x} \]

or, equivalently,

\[ a\frac{\partial\mu}{\partial y} + \mu\frac{\partial a}{\partial y} = b\frac{\partial\mu}{\partial x} + \mu\frac{\partial b}{\partial x}. \]

Clairaut was able to give a number of ways of finding the integrating factor $\mu$ in
particular cases.

3 Only in the nineteenth century did mathematicians rephrase this as a necessary and sufficient
condition.

4 See Engelsman [68] for the details. The quote from Nicolaus I Bernoulli occurs on p. 106.

5 See Clairaut [44], 45. Clairaut developed his ideas in competition with Alexis Fontaine. For a
discussion of their rivalry, and the full context, which includes investigations into the shape of the
Earth, see Greenberg [129].

Clairaut had heard about Euler’s work on the subject from Daniel Bernoulli and 
wrote to Euler about it on 17 September 1740, enclosing copies of some of his 
papers. Euler wrote back on 19 October to say that he was very pleased with them, 
and this started a productive association between the two, a highlight of which is 
Clairaut’s analysis of the motion of the Moon that was one of the decisive papers in 
the Continental acceptance of Newtonian gravity in 1749.⁶

All this work led to the emergence of partial differential equations as a topic 
of investigation. The first important problem to be solved in the theory of partial 
differential equations was the problem of the vibrating string, and historians place 
the emphasis here, rather than on the question of integrating factors, because it marks 
a real if intangible shift towards the full acceptance of two independent variables. 
Clairaut had seen the question of finding an integrating factor as a question about 
differentials and not as a question in the subject of partial differential equations; such 
a subject did not exist and he was not inspired to create one. 

But even the story of the wave equation tells us that a new field of enquiry was 
only gradually being born. However natural it might seem for someone trying to
create a general theory of second-order partial differential equations to take the wave
equation as a major example, d’Alembert never wrote down the wave equation when
he studied the motion of the vibrating string in his [52]. Even at this stage, the problem
of the vibrating string was a problem in the partial differential calculus rather than 
in the theory of partial differential equations, which was still to be created. 


3.3 D’Alembert: The Vibrating String and the Wave
Equation


The problem of the vibrating string had by then attracted the attention of mathemati- 
cians for over a century because of its close connections to music. Every musician 
knows that a violin string makes a predictable sound and that tightening the string 
raises its pitch, as does shortening it. In 1638, Marin Mersenne had stated this law 
for determining the frequency of vibration of a string: 


\[ \nu = \frac{\sigma}{\ell}\sqrt{T}, \]


where $\nu$ denotes the frequency, $\ell$ the length, and T the tension in the string and $\sigma$ is a
constant (determined by the nature of the string). However, neither he nor anyone he
consulted could explain why this rule should be true. Nor could Christiaan Huygens,
some years later. The first person to get anywhere with it was Brook Taylor in 1713.

6 See the discussion in Sect. A.2.

Taylor came from a musical family (he played the harpsichord) and his cryptic
paper (Taylor [251]) earned him the reputation of being the first person to derive
Mersenne’s law mathematically. After writing up this theoretical account, he devised 
ingenious experiments designed to measure the rate at which harpsichord strings 
vibrate (they vibrate too fast for anyone to count). 

Taylor began his paper by making two simplifying assumptions. 


• The amplitude of oscillation of a string is independent of its frequency (volume is
independent of pitch).

• The string vibrates in such a way that all of the string crosses the axis simultaneously.


The second, and far from plausible, assumption enabled Taylor to argue that each 
point of the string behaved like a simple pendulum, and that each point on it moves 
up and down with the same period. He then argued that the force at each point is 
determined by the curvature of the string at that point, and that it is equal to the force 
that would cause a simple pendulum to oscillate with the same period as the string. 
As a result, Taylor was able to determine the shape of the string and its frequency of
vibration, and to derive Mersenne’s law. 

In fact, Taylor’s second assumption is wrong for all but the simplest oscillations, 
nor does it follow that each point of the string behaves like a simple pendulum. Even 
so, his analysis was the accepted one until it was replaced by d’Alembert’s account.


3.3.1 D’Alembert’s Breakthrough 


In a paper written in 1747 and published in 1749, d’Alembert (see Fig. 3.1) assumed
that the string was of uniform thickness. He then wrote⁷:


Let t be the time elapsed from the moment when the string started to vibrate: it is certain
that the ordinate PM can only be expressed by a function of the time t and of the abscissa
or the corresponding arc s or AP. Let, therefore, $PM = \varphi(t, s)$, that is, let it be equal to an
unknown function of t and s.⁸


So in d’Alembert’s account, the height of the string above the x-axis at time t is given
by a function $\varphi(t, s)$, where the variable s denotes arc length along the string. He
then explicitly assumed that the vibrations of the string are so small that the length
of the string from one point to another is “reasonably equal” to the difference in
the x coordinates of the points. This made the mathematics tractable, at the price of
considerably restricting the analysis.

D’Alembert set $d\varphi = p\,dt + q\,ds$, where $p = \frac{\partial\varphi}{\partial t}$ and $q = \frac{\partial\varphi}{\partial s}$. He referred to Euler
[72] for the equality of mixed partial derivatives, and stated that

\[ dp = \alpha\,dt + \nu\,ds, \qquad dq = \nu\,dt + \beta\,ds, \tag{3.1} \]

where

\[ \alpha = \frac{\partial^2\varphi}{\partial t^2}, \qquad \nu = \frac{\partial^2\varphi}{\partial t\,\partial s}, \qquad \beta = \frac{\partial^2\varphi}{\partial s^2}. \]

Fig. 3.1 Jean le Rond d’Alembert (1717-1783). Artist unknown, after Maurice Quentin de la Tour

He then followed Taylor, and argued on physical grounds that the acceleration of a
point on the string is proportional to $\pm\beta$, where the sign is positive if the curve is
concave towards the x-axis and negative if it is convex.

7 See D’Alembert [52].
8 Quoted in Struik Source Book, 353.

Next, he argued that the acceleration at a point depends only on position, so 
5 = Bf (this uses the identification of string length with the x coordinate, so the 
oscillations must be very small). Then, by looking at the position of the string at 
two moments of time an amount dt apart, he argued that a point will have moved an
amount $\alpha\,dt$. He brought these two observations together by referring to Newton’s
Principia, where Newton had discussed motion under gravity, to deduce that (with
respect to a suitable choice of units)⁹:


\[ \alpha = \beta. \]


Had he transcribed his remark into the notation of second partial derivatives he
would have written the wave equation,

\[ \frac{\partial^2\varphi}{\partial t^2} = \frac{\partial^2\varphi}{\partial s^2}, \]

where $\frac{\partial^2\varphi}{\partial t^2}$ is the second derivative of $\varphi$ with respect to t with s regarded as a constant,
and $\frac{\partial^2\varphi}{\partial s^2}$ is the second derivative of $\varphi$ with respect to s with t regarded as a constant,
but evidently that is not how d’Alembert thought of it.

9 Struik notes (Source Book 354, n. 5) that the reference is to Principia Book I, Sec. X, Prop. LII,
where Newton reworked Huygens’s discussion of the pendulum, and that Taylor had done the same.

To solve the equations,

\[ dp + dq = (\alpha + \nu)(dt + ds) \quad\text{and}\quad dp - dq = (\alpha - \nu)(dt - ds), \]

we note that they can be written in the form

\[ dp + dq = (\alpha + \nu)\,d(t + s) \quad\text{and}\quad dp - dq = (\alpha - \nu)\,d(t - s), \]

when it becomes clear that p + q is a function of t + s and p − q is a function of
t − s. D’Alembert wrote

\[ p = \Phi(t + s) + \Delta(t - s) \quad\text{and}\quad q = \Phi(t + s) - \Delta(t - s), \]

and deduced that the general shape of the string was given by

\[ \varphi = \Psi(t + s) + \Gamma(t - s), \]

where $\Psi$ and $\Gamma$ are arbitrary functions; $\Phi(t + s) = \Psi'(t + s)$ and
$\Delta(t - s) = \Gamma'(t - s)$. Here, the primes denote differentiation with respect to
the arguments t + s and t − s.
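
A quick numerical check shows how a sum of a function of t + s and a function of t − s satisfies the wave equation in d’Alembert’s units. In the sketch below the two profiles (a sine and a Gaussian), the sample point and the step size are arbitrary choices of ours, and the function names are hypothetical.

```python
import math

def psi(u):
    return math.sin(u)          # stands in for an arbitrary function of t + s

def gamma(u):
    return math.exp(-u * u)     # stands in for an arbitrary function of t - s

def phi(t, s):
    """Shape of the string: a function of t + s plus a function of t - s."""
    return psi(t + s) + gamma(t - s)

def wave_residual(t, s, h=1e-3):
    """Second difference in t minus second difference in s."""
    d2_dt2 = (phi(t + h, s) - 2.0 * phi(t, s) + phi(t - h, s)) / h**2
    d2_ds2 = (phi(t, s + h) - 2.0 * phi(t, s) + phi(t, s - h)) / h**2
    # Shifting t by h and shifting s by h sample psi and gamma at the same
    # arguments, so the two second differences agree up to rounding.
    return d2_dt2 - d2_ds2

print(wave_residual(0.4, 0.9))
```

The printed residual should be zero up to rounding error, whatever smooth profiles are substituted for psi and gamma.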

This was a dramatic moment: the first time that the most powerful branch of the 
calculus, that of differential equations, was shown to extend to problems with more 
than one independent variable.¹⁰ A path now seemed to be open to tackle the many
problems in several variables in which the natural world would surely abound. 

But the success soon brought with it a profound disquiet. D’Alembert’s solutions
were anything of the form

\[ \varphi(t, s) = f(ct + s) + g(ct - s), \]

where f and g are arbitrary functions. Upon reflection, the functions f and g should
be capable of being differentiated twice (and of course for each value of t the graph
of $\varphi$ depicts a string fastened down at each end, as the original problem requires).
This solution is very general, which is as it should be, because the string can be
released from any initial shape and with any initial velocity. As d’Alembert noted,
“this equation includes an infinity of curves”.¹¹


10 Earlier, d’Alembert had reformulated Daniel Bernoulli’s study of the small oscillations of a
hanging chain as a partial differential equation, but he had not been able to solve it. See D’Alembert
[50], 171.

11 Quoted in Struik, Source Book, p. 355.
But just how general such a solution curve could be was a question that, once
raised, was to become one of the most famous mathematical controversies of the 
century. For a further discussion of it, see any good history of mathematics in the 
eighteenth century. 

In his [53], d’Alembert looked for solutions of the form

\[ \varphi(t, s) = F(t) \times G(s). \]

This reduced his differential equation in two independent variables to two differential
equations each in a single variable, as follows.

If we substitute $\varphi(t, s) = F(t) \times G(s)$ into the equation

\[ \frac{\partial^2\varphi}{\partial t^2} = c^2\,\frac{\partial^2\varphi}{\partial s^2}, \]

we obtain

\[ \frac{\partial^2 F}{\partial t^2}\,G(s) = c^2\,\frac{\partial^2 G}{\partial s^2}\,F(t), \]

which implies that

\[ \frac{1}{F(t)}\frac{\partial^2 F}{\partial t^2} = c^2\,\frac{1}{G(s)}\frac{\partial^2 G}{\partial s^2}. \]

But a function of t can only be equal to a function of s if they are both constant,
which we shall call $-k^2c^2$. Then we obtain the ordinary differential equations

\[ \frac{d^2F}{dt^2} = -k^2c^2F(t) \quad\text{and}\quad \frac{d^2G}{ds^2} = -k^2G(s). \]

The solutions to these equations are of the form:

\[ F(t) = \cos kct \ \text{or}\ \sin kct, \qquad G(s) = \cos ks \ \text{or}\ \sin ks. \]

Here, c is a constant determined by the material in the string, the tension in the string,
and its shape, and $k^2$ is some constant, as yet undetermined, related to the frequency
of the string’s vibration. (We can now see that only choosing a negative quantity for
the above constant makes physical sense: if you follow through the above argument
with $+k^2c^2$ you see that hyperbolic functions are obtained that cannot match the
presumed boundary conditions and are liable to grow impossibly large.)

For the first time, recognisable solutions had appeared: functions of the form
$\cos kct\,\cos ks$ (or $\cos kct\,\sin ks$, and so on) are solutions of the wave equation,
although by no means the most general.

We shall shortly look ahead by three years to see how Euler also found that the
wave equation has very general solutions. His solution is also important because it
is one of the first occasions where the equality of mixed partial derivatives was used
and understood. But first, we attend to the physics of musical sounds.


3.3.2 Mersenne’s Law and Modes 


First, it is clear that d’Alembert’s new ideas led to the first satisfactory deduction of
Mersenne’s law. If we take the solution

\[ \varphi(t, s) = \cos kct \times \sin ks \]

and look at a particular point on the string (that is, at the situation for a fixed value
of s), then as time varies this point moves according to the equation

\[ \varphi = \cos kct \times \text{constant}. \]

This means that it oscillates with a frequency, kc, that is the same whatever point of
the string is taken. So, a given string vibrates with a specific frequency.

To see that the frequency kc is inversely proportional to the length of the string,
observe that both ends of the string must be fixed, so $\varphi = 0$ when $s = \ell$. Therefore,
$\sin k\ell = 0$ so $k\ell$ must be a multiple of $\pi$, say $k\ell = N\pi$. So $k = N\pi/\ell$ and the
frequency is inversely proportional to the length of the string, as Mersenne had
claimed.

Musicians knew that halving a string results in a note an octave above the basic
note (it doubles the frequency). This phenomenon of modes was explained in the
manner of d’Alembert by Euler in his [80], Sect. 41. The solutions

\[ \varphi = \cos(\pi ct/\ell) \times \sin(\pi s/\ell) \]

correspond to the choice $k = \pi/\ell$ (with N = 1) and

\[ \varphi = \cos(2\pi ct/\ell) \times \sin(2\pi s/\ell) \]

correspond to the choice $k = 2\pi/\ell$ (N = 2). In the second case the string, which has
twice the frequency, behaves as though it were two strings each of half the original 
length and joined at a fixed point in the middle. This explains how the same string 
can be made to play certain different notes without being tightened or changed in 
length, and indeed that it will naturally vibrate in a variety of ways—but not in any 
way: the tones it can emit—its harmonics—are all the notes whose frequencies are 
multiples of a basic frequency. 
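
The behaviour of the modes can be tabulated directly. The sketch below is an illustration with assumed values for the length and for c and with hypothetical function names; it takes k = Nπ/ℓ, and converts the rate kc at which the cosine factor turns into cycles per unit time by dividing by 2π.

```python
import math

def mode(t, s, n, length, c):
    """The N-th standing wave cos(k c t) * sin(k s) with k = n * pi / length."""
    k = n * math.pi / length
    return math.cos(k * c * t) * math.sin(k * s)

def frequency(n, length, c):
    """Cycles per unit time of the N-th mode: k c / (2 pi) = n c / (2 length)."""
    return n * c / (2.0 * length)

length, c = 1.0, 2.0   # illustrative values only
for n in (1, 2, 3):
    print(n, frequency(n, length, c))

# Halving the string doubles the fundamental frequency (the octave).
print(frequency(1, length / 2.0, c) / frequency(1, length, c))

# The second mode has a fixed point in the middle of the string.
print(mode(0.3, length / 2.0, 2, length, c))
```

The frequencies printed are whole-number multiples of the first, and the last value is zero up to rounding.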

Another musical phenomenon, which had puzzled Mersenne, is that a string can
emit several notes at once. But it is easy to show that if $\varphi(t, s)$ and $\psi(t, s)$ are
solutions of the wave equation, then so is any sum of the form $a\varphi + b\psi$ where $a$ and
$b$ are arbitrary constants. So a string may vibrate in two or more ways simultaneously,
emitting two or more different notes as it does so. Euler and Daniel Bernoulli had
noticed the same behaviour in their analysis of the vibrating clamped rod in the mid-1730s.
It makes it clear, also, that Taylor’s assumption that the whole string crosses
the axis simultaneously must be wrong.
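
The superposition property can be illustrated with a small numerical experiment. This is only a sketch under assumptions of our own: the two modes, the constants a and b, the wave speed, and the finite-difference step are all arbitrary illustrative choices, and a finite-difference check is of course no substitute for the one-line proof by linearity.

```python
import math

C = 1.0  # wave speed; an arbitrary illustrative value

def phi(t, s):
    return math.cos(1.0 * C * t) * math.sin(1.0 * s)   # mode with k = 1

def psi(t, s):
    return math.cos(2.0 * C * t) * math.sin(2.0 * s)   # mode with k = 2

def combo(t, s, a=0.7, b=-1.3):
    """A linear combination a*phi + b*psi with arbitrary constants a and b."""
    return a * phi(t, s) + b * psi(t, s)

def wave_residual(f, t, s, h=1e-3):
    """Central-difference estimate of f_tt - C^2 f_ss at the point (t, s)."""
    f_tt = (f(t + h, s) - 2.0 * f(t, s) + f(t - h, s)) / h**2
    f_ss = (f(t, s + h) - 2.0 * f(t, s) + f(t, s - h)) / h**2
    return f_tt - C**2 * f_ss

if __name__ == "__main__":
    # Each residual should be close to zero if the function solves the wave equation.
    for name, f in (("phi", phi), ("psi", psi), ("a*phi + b*psi", combo)):
        print(name, wave_residual(f, 0.4, 0.9))
```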


3.4 Euler Rewrites the Wave Equation 


Although it will take us slightly ahead of the story, this is a possible place to look 
at how Euler treated the wave equation some years later, as he began to develop a 
theory of partial differential equations. 

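In his [80], E213, Euler first rederived the equation of the vibrating string in a
way that he felt led more simply to the solution. He took the wave equation in the
form $\left(\frac{\partial^2}{\partial t^2} - c^2 \frac{\partial^2}{\partial x^2}\right) y = 0$, factorised it as
$\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right) y = 0$, and argued
(in Sect. 25 of his paper) that the equation of motion of the string can be regarded as
a system of two first-order differential equations.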

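More precisely, Euler considered (Sect. 25) a function $y$ that satisfied this first-order
equation

\[ \frac{\partial y}{\partial t} = c\,\frac{\partial y}{\partial x} \tag{3.2} \]

and argued that therefore

\[ \frac{\partial^2 y}{\partial t^2} = \frac{\partial}{\partial t}\left(\frac{\partial y}{\partial t}\right) = c\,\frac{\partial}{\partial t}\left(\frac{\partial y}{\partial x}\right) = c\,\frac{\partial^2 y}{\partial t\,\partial x} \]

\[ = c\,\frac{\partial^2 y}{\partial x\,\partial t} = c\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial t}\right) = c^2\,\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial x}\right) = c^2\,\frac{\partial^2 y}{\partial x^2}. \]

So a solution of Eq. (3.2) satisfies the wave equation

\[ \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}. \]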


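Now, if $y = f(x + ct)$ let $z = x + ct$ and write

\[ y = f(z), \quad z = g(x, t) = x + ct. \]

Note that $\partial g/\partial t = c$ and $\partial g/\partial x = 1$. The chain rule for differentiation yields:

\[ \frac{\partial y}{\partial t} = \frac{df}{dz}\frac{\partial g}{\partial t} = c\,\frac{df}{dz}, \]

\[ \frac{\partial y}{\partial x} = \frac{df}{dz}\frac{\partial g}{\partial x} = \frac{df}{dz}, \]

so

\[ \frac{\partial y}{\partial t} = c\,\frac{\partial y}{\partial x}. \]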


Therefore, functions of the form y = f(x + ct) are solutions of the wave equation.
By an analogous argument, so are functions of the form y = f(x — ct). Euler fin- 
ished off by checking explicitly that functions of this form are solutions of the wave 
equation. 
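
Euler’s explicit check can be imitated numerically. The sketch below is illustrative only: the Gaussian profile f, the value c = 2, the sample point, and the step sizes are assumptions chosen for the example, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

C = 2.0   # wave speed; an arbitrary illustrative value

def f(z):
    """Arbitrary smooth profile (an assumption): a Gaussian bump."""
    return math.exp(-z * z)

def y_plus(x, t):
    return f(x + C * t)

def y_minus(x, t):
    return f(x - C * t)

def d_dt(y, x, t, h=1e-5):
    return (y(x, t + h) - y(x, t - h)) / (2.0 * h)

def d_dx(y, x, t, h=1e-5):
    return (y(x + h, t) - y(x - h, t)) / (2.0 * h)

def wave_residual(y, x, t, h=1e-3):
    """Central-difference estimate of y_tt - C^2 y_xx."""
    y_tt = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h**2
    y_xx = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h**2
    return y_tt - C**2 * y_xx

if __name__ == "__main__":
    x0, t0 = 0.3, 0.1
    # f(x + ct) satisfies y_t = c y_x; f(x - ct) satisfies y_t = -c y_x.
    print(d_dt(y_plus, x0, t0) - C * d_dx(y_plus, x0, t0))
    print(d_dt(y_minus, x0, t0) + C * d_dx(y_minus, x0, t0))
    # Both satisfy the second-order wave equation (residuals near zero).
    print(wave_residual(y_plus, x0, t0), wave_residual(y_minus, x0, t0))
```

Each of the two first-order factors annihilates one family of solutions, which is the content of the factorisation described above.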


3.5 Formal Complex Methods 


As we have seen, there was a close connection between certain partial differential 
equations and certain exact differentials that mathematicians such as d’Alembert
and Euler were learning to exploit. A major example of this method was published
by d’Alembert in 1752 in another context. He began with two differentials similar to 
the ones above, imposed an exactness condition, and was led to a different constraint 
on a collection of functions. Although the only change in the differentials was one 
change of sign, the consequence was the arrival of what may be called formal complex 
methods in the theory of differential equations. 

D’Alembert entered the manuscript of his Essai d’une nouvelle théorie de la
résistance des fluides for a prize competition of the Berlin Academy, and when
no prize was awarded he blamed Euler and their already poor relations worsened.¹²
D’Alembert reworked the manuscript and published it as a book [54], which is where
the differential equations of hydrodynamics were first written down.

In Chap. IV, Sect. 45 d’Alembert considered a two-dimensional flow in the (x, z)-plane.
The general problem was very difficult, and so he turned (in Sects. 57–60) to
address the simpler task of finding the conditions on functions M and N of x and z
such that the differentials

\[ M\,dx + N\,dz, \qquad N\,dx - M\,dz \tag{3.3} \]

are exact. This resembles his account of the wave equation, but with a change of sign:
$M = \alpha$, $N = \nu$ and with $\beta = -M$. As we shall see, this analogy was to be much
appreciated by Euler.


¹² They only improved until 1759 when d’Alembert declined Frederick the Great’s invitation to
become President of the Berlin Academy and recommended Euler instead; neither was appointed 
and Euler soon left for St. Petersburg. 
D’Alembert then argued that if the differentials are exact then there are functions
$p$ and $q$ such that

\[ M\,dx + N\,dz = dq, \qquad N\,dx - M\,dz = dp. \]

Therefore, the differentials

\[ (M + iN)(dx - i\,dz) \quad \text{and} \quad (M - iN)(dx + i\,dz) \]

are exact, and, on putting $du = dx - i\,dz$ and $dt = dx + i\,dz$, $M + iN = \sigma$ and
$M - iN = \tau$, the expressions $\sigma\,du$ and $\tau\,dt$ become exact differentials. Therefore,
$\sigma = M + iN$ must be a function of $u = x - iz$, and $\tau = M - iN$ must be a function
of $t = x + iz$. This allowed him to “deduce the values of M and N”, as he put it;
they are

\[ M = \frac{1}{2}(\sigma + \tau) \quad \text{and} \quad N = \frac{1}{2i}(\sigma - \tau). \]

D’Alembert then gave what he called a simpler argument to the same effect. The
definitions of $p$ and $q$ imply that

\[ p_z = -q_x \quad \text{and} \quad q_z = p_x, \]

and he deduced immediately that $q\,dx + p\,dz$ and $p\,dx - q\,dz$ are exact differentials
and that therefore $q + ip$ is a function of $x - iz$ and $q - ip$ is a function of $x + iz$.
(This is correct because $(q + ip)(dx - i\,dz) = q\,dx + p\,dz + i(p\,dx - q\,dz)$ is
exact.) So, he set $q + ip = F(x - iz)$ and $q - ip = G(x + iz)$ and separated out
the corresponding expressions for $p$ and $q$.

Now he had to show that this line of argument led to real-valued functions because
his problem was connected to real-valued functions in the plane. So he said that for
$p$ and $q$ to be real $q$ must be a function of the form

\[ \xi(x - iz) + i\zeta(x + iz) + \xi(x + iz) - i\zeta(x - iz), \]

where the functions $\xi$ and $\zeta$ (regarded as power series, something d’Alembert always
thought possible) have real coefficients (he omitted the similar result for $p$).

So d’Alembert obtained expressions for $p$ and $q$ in which, in his phrase, the
imaginary quantities destroy themselves. Later eyes can see that d’Alembert had
sketched a quick argument from the Cauchy–Riemann equations to the existence of
formal complex functions, but d’Alembert did not; indeed, he made no further use of
those equations in the rest of the memoir. Instead, he returned to his original problem
and solved it by power series methods.

However, d’Alembert’s work was very influential; there are many later references
to “the method of d’Alembert” in the study of surfaces. Formal complex methods
rely on a free transition from real to complex variables and functions, which are then
handled by algebra and differentiation, but with no appreciation of what it is for a
function to be complex differentiable. For example, $\xi(x - iz)$ and $\xi(x + iz)$ above
are complex conjugates, as are $\zeta(x + iz)$ and $\zeta(x - iz)$, and so $q$ is twice the real part of the
complex-valued function $\xi(x - iz) + i\zeta(x + iz)$, and $q$ is a harmonic function of $x$
and $z$, but none of this was remarked upon by d’Alembert. The method provided a
convenient notation, at the cost of requiring that any imaginary quantities could be
made to vanish in the end, and leave only equations between purely real quantities.
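
A modern reader can test d’Alembert’s formal argument on an example. The following Python sketch is an illustration under assumptions of our own: the cubic F, the sample point, and the step sizes are hypothetical choices, and derivatives are estimated by central differences. It sets q + ip = F(x − iz) and checks the relations p_z = −q_x and q_z = p_x, together with the fact that q is harmonic.

```python
def F(w):
    """Hypothetical choice of F with real coefficients: F(w) = w^3 + 2w."""
    return w**3 + 2.0 * w

def q(x, z):
    return F(complex(x, -z)).real    # real part of F(x - iz)

def p(x, z):
    return F(complex(x, -z)).imag    # imaginary part of F(x - iz)

def d_dx(f, x, z, h=1e-4):
    return (f(x + h, z) - f(x - h, z)) / (2.0 * h)

def d_dz(f, x, z, h=1e-4):
    return (f(x, z + h) - f(x, z - h)) / (2.0 * h)

def laplacian(f, x, z, h=1e-3):
    """Central-difference estimate of f_xx + f_zz."""
    return (f(x + h, z) + f(x - h, z) + f(x, z + h) + f(x, z - h)
            - 4.0 * f(x, z)) / h**2

if __name__ == "__main__":
    x0, z0 = 0.7, -0.4
    print(d_dz(p, x0, z0) + d_dx(q, x0, z0))   # p_z + q_x, should be near zero
    print(d_dz(q, x0, z0) - d_dx(p, x0, z0))   # q_z - p_x, should be near zero
    print(laplacian(q, x0, z0))                # q_xx + q_zz, should be near zero
```

Python’s built-in complex type does the work that d’Alembert did by formal algebra; the imaginary parts are kept separate throughout instead of being made to "destroy themselves" at the end.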


3.6 Exercises 


1. Find the equation of a plucked string released from an initial position formed by
a straight line joining the point (x, t) = (0, 0) to (3π/4, 1) and a straight line
joining (3π/4, 1) to (π, 0).

2. Either write a programme to show the motion of the string or find one on the web. 
How would you describe the motion of the string? How does it produce a sound? 
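
One possible starting point for such a programme is sketched below in Python. It is a minimal illustration under stated assumptions, not a definitive answer to the exercise: the wave speed c = 1, the text-based picture, and all function names are our own choices, and the motion is computed from d’Alembert’s formula y = ½[F(x + ct) + F(x − ct)], where F is the odd, 2π-periodic extension of the initial shape of a string released from rest.

```python
import math

L = math.pi                # string length, from the exercise
PEAK_X = 3 * math.pi / 4   # position of the pluck, from the exercise
C = 1.0                    # wave speed (an assumption)

def initial_shape(x):
    """Plucked profile on [0, L]: lines from (0, 0) to (PEAK_X, 1) to (L, 0)."""
    if x <= PEAK_X:
        return x / PEAK_X
    return (L - x) / (L - PEAK_X)

def odd_periodic(x):
    """Odd, 2L-periodic extension of the initial shape to all real x."""
    x = math.fmod(x, 2 * L)   # now strictly between -2L and 2L
    if x < 0:
        x += 2 * L            # now between 0 and 2L
    if x <= L:
        return initial_shape(x)
    return -initial_shape(2 * L - x)

def displacement(x, t):
    """d'Alembert's formula for a string released from rest."""
    return 0.5 * (odd_periodic(x + C * t) + odd_periodic(x - C * t))

def render(t, columns=41, rows=11):
    """Return a text picture of the string at time t (top row is y = +1)."""
    grid = [[" "] * columns for _ in range(rows)]
    for j in range(columns):
        x = L * j / (columns - 1)
        y = displacement(x, t)                     # lies between -1 and 1
        i = round((1.0 - y) / 2.0 * (rows - 1))    # map y to a row index
        i = min(rows - 1, max(0, i))               # guard against rounding
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)

if __name__ == "__main__":
    period = 2 * L / C
    for step in range(5):      # snapshots over the first half period
        t = step * period / 8
        print(f"t = {t:.3f}")
        print(render(t))
```

Printing a sequence of frames in this way gives a rough picture of the motion; a plotting library would give a smoother picture, but none is assumed here.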


Questions 


1. In the absence of any clear distinction at the time between continuous, differ- 
entiable, and analytic, what could Euler have meant by a curve drawn by a free 
motion of the hand? 

2. What sorts of solutions might Euler admit to the partial differential equation 


ot 02 
Ox 
3. It was quickly appreciated that if a single term of the form cos kct sin ks is a
solution of the wave equation then so is a sum of terms of this form, and even an 
infinite sum of terms of this form. This resulted in a number of ‘Fourier series’ 
being produced before Fourier, and speculation about whether every function 
can be written in this form. Daniel Bernoulli suggested that might be true, Euler 
disagreed, and Lagrange got close to providing a plausibility argument for the 
claim before retreating. Information about this debate can be found in Bottazzini 
[21] and in the commentaries on Euler’s works. Given the range of functions 
known to them, what do you think an eighteenth-century mathematician might 
say was involved in deciding this claim? 


Chapter 4
Rational Mechanics


4.1 Introduction 


The study of many natural phenomena was opened up by the use of partial differential 
calculus. In particular, in the late 1750s, Euler was able to extend his methods to pro- 
duce equations for the motion of an ideal fluid. He also studied the partial differential 
equations that describe the propagation of sound. This led him to a partial differential 
equation of lasting importance for the theory, and in the course of attempting to solve 
it, he also came up with an ordinary differential equation that was to be the most 
important and thoroughly analysed equation of its type in the nineteenth century 
(as we shall see in Chaps. 11 and 16). He also gave the first general formulation of 
rigid body mechanics, which rewrote and extended Newton’s ideas and put them into 
something like their modern form. 


4.2 Fluid Mechanics 


The English word ‘hydrodynamics’ is derived from the title of a book Daniel 
Bernoulli published in 1738, his Hydrodynamica, which is in turn a word that he 
coined. His topic was an old one, the flow of water from a vessel through a pipe, 
and his advance was to determine how the pressure on the walls of a container of
fluid is affected by the velocity of the water. One presumably
unfortunate result of his success was that his competitive father Johann Bernoulli 
then not only wrote and published his Hydraulica in 1742 but sought to pretend that 
it had been written in 1732, thus claiming priority over Daniel’s work. 

Much more important investigations were soon made: d’Alembert’s Traité de
dynamique [50], and in particular his Réflexions sur la cause générale des vents [51]
which outlined a new theory of the tides, and of course his Essai d’une nouvelle
théorie de la résistance des fluides, and Clairaut’s memoir Théorie de la figure de
la terre (Theory of the shape of the Earth) [44] which discussed the shape of the
Earth, regarded as a rotating fluid mass. Even so, it was Euler’s work on fluids that
has become the basis of all subsequent mathematical discussions of the motion of
fluids.

Fig. 4.1 Leonhard Euler (1707-1783) by Jakob Emanuel Handmann, 1756

Euler (Fig. 4.1) published three memoirs on the subject in 1757 (E225, 226, 227) 
and a further paper (E258) in 1761. They demonstrate very clearly both the power 
of the new methods and the difficulties that have to be overcome. He began E225 by 
noting that there was a consensus that the different behaviour of solids and liquids 
must be explained by expressing clearly the essential difference between the two. 
But, he said, this difference had never previously been understood. He proposed 
that it consisted in the fact that a solid can be held in equilibrium by two equal and 
opposite forces, whereas a fluid is only in equilibrium if it is held in place by an equal 
force at every point of its surface that acts perpendicular to the surface (he explicitly 
assumed that no forces are acting inside the fluid). 

On this foundation, he derived the equations of motion for a perfect fluid, one 
that is incompressible and inviscid (without viscosity or ‘stickiness’). Euler was 
particularly pleased to derive a theory of fluids based on the idea that a fluid is 
composed of infinitesimal solid bodies because this extended his version of Newton’s 
mechanics to fluids. 

In an incompressible fluid Euler’s principle implies that the pressure in a body of
fluid in equilibrium is known when it is known at a single point, and indeed that the
pressure at a point depends only on the depth of the point.¹

Euler analysed a fluid by introducing mutually perpendicular axes OA, OB, OC.
This was his standard approach to all questions in dynamics. He resolved the force
of gravity acting at a point Z in the fluid along these three axes as follows: the
components of the force in the directions OA, OB, and OC through Z he called P,
Q, and R, respectively. He regarded P, Q, and R as functions of x, y, and z.

¹ In later papers, he noted the changes that have to be made if the fluid is elastic or compressible.

He next considered an infinitesimal volume of fluid in the form of a parallelepiped 
with one vertex at Z and of size dxdydz. He took the density of the fluid to be q, 
so the forces acting on the parallelepiped were written down as Pqdxdydz in the
direction ZL = OA, Qqdxdydz in the direction ZM = OB, and Rqdxdydz in the 
direction ZN = OC. The pressure of the fluid above the parallelepiped, which Euler 
supposed to form a column of height p, is the other force involved. 

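Euler then wrote²

\[ dp = L\,dx + M\,dy + N\,dz, \]

where, accordingly,

\[ \frac{\partial L}{\partial y} = \frac{\partial M}{\partial x}, \quad \frac{\partial L}{\partial z} = \frac{\partial N}{\partial x}, \quad \frac{\partial M}{\partial z} = \frac{\partial N}{\partial y}. \]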


He then considered the change in pressure between opposite sides of the paral- 
lelepiped and deduced that 


L=Pq, M=Qq, N= Rq. 


Therefore, 
dp = q(Pdx + Qdy + Rdz). 


Because the left-hand side can be integrated, Euler deduced from this that the right- 
hand side can also be integrated, and so (following some earlier remarks by Clairaut) 


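\[ \frac{\partial (Pq)}{\partial y} = \frac{\partial (Qq)}{\partial x}, \quad \frac{\partial (Pq)}{\partial z} = \frac{\partial (Rq)}{\partial x}, \quad \frac{\partial (Qq)}{\partial z} = \frac{\partial (Rq)}{\partial y}. \]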


Much now depends on whether the fluid has a constant density or if the density 
varies with the depth. The analysis became complicated and so Euler turned to the 
study of particular cases, such as the theory of the barometer. Here, he observed that 
a wind must arise whenever the heat at equal heights is different and that a study of 
the equilibrium figures of a fluid suggests that some of them might approximate the 
shape of a planet. 


Euler returned to his general analysis in his next paper, E226. Here, he considered 
the motion of an infinitesimal cube in the fluid. It will be helpful to sketch his 
argument initially without the mathematical details. 


² Euler wrote $\left(\frac{dL}{dx}\right)$ where we have written $\frac{\partial L}{\partial x}$; in this period, partial derivatives were often written
as ordinary derivatives enclosed in round brackets.

In an infinitesimal moment of time, the infinitesimal cube is stretched or squashed
in the directions of the axes, although its volume cannot change because the fluid is 
assumed to be incompressible. Euler considered that what drives the motion is the 
difference in pressure between each pair of faces of the infinitesimal cube. Consider, 
for example, what is involved in saying that an infinitesimal cube does not sink 
under gravity. The pressure on the bottom face differs from the pressure on the top 
face by an amount equal to the weight of the liquid in the cube, which is equal to 
the volume of the cube times the density of the liquid. If these quantities were not 
equal, the pressure difference would cause the cube to move. This pressure difference 
manifests itself as a force, and this will bring about a change in the velocities of each 
particle of the fluid. 

If we put all this together, we expect to find equations, one in each of the x-, y-, 
and z-directions, that say that a pressure difference on a small cube of fluid is equal 
to an acceleration in that direction multiplied by the mass of the cube. We indeed 
expect to see that the acceleration in the fluid will be described as the acceleration 
at each point and some measure of the stretching and squashing of the cube. This is 
exactly what Euler found. 

Euler supposed that at a point in the fluid with coordinates (x, y, z) the velocity
was (u, v, w), where each of u, v, and w is a function of x, y, z, and the time t. At a
nearby point with coordinates (x + dx, y + dy, z + dz) the velocity is given by

\[ u + \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy + \frac{\partial u}{\partial z}dz \]

in the x-direction, and by similar expressions for the velocities in the y- and z-directions.

As for the pressure differences, if p is the pressure, the difference in pressure
across the faces separated by a distance dx is $\frac{\partial p}{\partial x}dx$.

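The differences in velocities are of two kinds. The point originally at (x, y, z) has
moved to (x + u dt, y + v dt, z + w dt), and each of u, v, and w has changed. For
example, by standard Taylor series arguments, the change in u is from u to

\[ u + \frac{du}{dt}dt = u + \left(\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt}\right)dt, \]

with similar expressions for v and w. Note that dx/dt = u, etc., so the increase in u
in a time dt is

\[ \left(\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z}\right)dt. \]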
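
The Taylor-series step can be checked on an example. In the Python sketch below, the velocity field (u, v, w), the sample point, and the time steps are all hypothetical choices made for illustration; the aim is only to compare the change in u experienced by a moving particle with the first-order expression (u_t + u u_x + v u_y + w u_z) dt.

```python
import math

# Hypothetical smooth velocity field, chosen only for illustration.
def u(x, y, z, t):
    return math.sin(x) * math.cos(y) + 0.5 * z * t

def v(x, y, z, t):
    return x * y - t

def w(x, y, z, t):
    return math.cos(z) + 0.1 * x

def increase_formula(x, y, z, t, dt):
    """(u_t + u u_x + v u_y + w u_z) dt, using the exact partial derivatives of u."""
    u_t = 0.5 * z
    u_x = math.cos(x) * math.cos(y)
    u_y = -math.sin(x) * math.sin(y)
    u_z = 0.5 * t
    return (u_t + u(x, y, z, t) * u_x + v(x, y, z, t) * u_y
            + w(x, y, z, t) * u_z) * dt

def increase_following_particle(x, y, z, t, dt):
    """Change in u seen by the particle that starts at (x, y, z) and moves for time dt."""
    x1 = x + u(x, y, z, t) * dt
    y1 = y + v(x, y, z, t) * dt
    z1 = z + w(x, y, z, t) * dt
    return u(x1, y1, z1, t + dt) - u(x, y, z, t)

if __name__ == "__main__":
    point = (0.4, -0.3, 0.8, 1.2)
    for dt in (1e-2, 1e-3, 1e-4):
        direct = increase_following_particle(*point, dt)
        formula = increase_formula(*point, dt)
        print(dt, direct, formula, abs(direct - formula))
```

The discrepancy between the two quantities should shrink roughly like dt squared, which is what one expects when terms of second order are neglected.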

To obtain the equations of motion, Euler interpreted the rule that force equals 
mass times acceleration as saying that acceleration equals force divided by mass, 
and took the mass of an infinitesimal cuboid to be its volume times its density. He 
assumed that the density, q, was constant, which is a good approximation for water
in many situations. 
When all this is put together, the result (see E226, p. 286) is these three equations:


On: ee. Ou ig Ou 1 Op (4.1) 
v = . 
at Ox Oy ” Oz q Ox 
Ov Ov Ov Ov 10p 
These are the differential equations for the motion of the fluid: the minus signs
arise because we (with Euler) are measuring gravity in the opposite direction to the 
corresponding axis and because pressure increases as depth increases. 

There is also an equation stating that the volume of any part of the fluid does not
change, which had been known a decade before to d’Alembert, in a paper Euler had
read. Euler explained that better, and expressed the conclusion more elegantly, in
his “Principia motus fluidorum” (Principles of the motion of fluids, E258), which he
published in 1761. He now considered the motion of a particle of the fluid infinitesimally
close to (x, y, z), and considered what would happen to an infinitesimally
small pyramid of the fluid in an interval of time dt. It is moved to another infinitesimally
small pyramid of the same volume, and by calculating these volumes and
equating them Euler deduced that in the flow at any instant

\[ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0. \]


This is known as the continuity equation for the flow of an incompressible liquid in 
space, and it expresses the idea that as the fluid flows it does not change its volume. 

Euler returned to the subject in a paper (E258) he published in 1761 as part of a
long investigation into the motion of fluids, and in this paper, the Laplace equation
was written down for the first time.³ Here, he again considered incompressible fluids
of constant density in two or three dimensions. In two dimensions, he wrote the
components of velocity of the fluid at a point as (u, v), and by calculating the volumes
of infinitesimal elements he deduced that for an incompressible fluid $u_x + v_y = 0$.
He then claimed that udx + vdy is exact (which it is only if the flow is irrotational,
in later terminology) and defined S as its integral, writing


dS = udx + vdy + Udt. 


From the continuity condition, he deduced that vdx − udy is another complete
differential.⁴


³ How it acquired Laplace’s name is a long story often told elsewhere.

⁴ This is only true if the fluid is, in today’s language, irrotational; Euler seems to have assumed this
in this early work.

He then repeated this argument in three dimensions, found that udx + vdy + wdz
is exact, and introduced⁵


dS = udx + vdy + wdz + Udt. 
In paragraph 67 he deduced that 
\[ S_{xx} + S_{yy} + S_{zz} = 0. \]


He then wrote “Since it is not obvious how in general this can be made to happen, 
I shall consider certain classes of possibilities”, and found, as his first example, that 
$(ax + by + cz)^n$ will work for any $n$, provided $a^2 + b^2 + c^2 = 0$ if $n > 1$. Linear
combinations of these will also work, and he wrote down all expressions of degree 
less than 6. Then he went back to his real business—the motion of fluids. 
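
Euler’s first class of solutions is easy to test numerically. The Python sketch below is an illustration under our own assumptions: the coefficients (3, 4, 5i), which satisfy a² + b² + c² = 0, the contrasting coefficients (1, 1, 1), the sample point, and the step size are all arbitrary choices, and the Laplacian is estimated by central differences. Only n = 1, 2, 3 are tried because the difference formula is exact, up to rounding, for polynomials of degree at most three.

```python
def S(x, y, z, n, a, b, c):
    """Euler's trial solution (ax + by + cz)^n; a, b, c may be complex."""
    return (a * x + b * y + c * z) ** n

def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of f_xx + f_yy + f_zz at (x, y, z)."""
    centre = f(x, y, z)
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * centre) / h**2

# Sample coefficients (assumptions): 3^2 + 4^2 + (5i)^2 = 0, whereas 1 + 1 + 1 != 0.
NULL = (3.0, 4.0, 5.0j)
NOT_NULL = (1.0, 1.0, 1.0)

if __name__ == "__main__":
    point = (0.3, -0.2, 0.5)
    for n in (1, 2, 3):
        good = laplacian(lambda x, y, z: S(x, y, z, n, *NULL), *point)
        bad = laplacian(lambda x, y, z: S(x, y, z, n, *NOT_NULL), *point)
        # For n = 1 both are near zero; for n > 1 only the first one is.
        print(n, abs(good), abs(bad))
```

With complex coefficients the function S is complex-valued, and its real and imaginary parts are then separately solutions of the equation.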

It does not seem that he went back to the two-dimensional case and deduced 
that powers of x ± iy are harmonic, or that he made use of this obvious deduction
elsewhere in his work. Most likely it became part of the general education of
mathematicians of the next generation without its significance being appreciated. 


However, it is one thing to have some equations of motion, and another to solve 
them. A few minutes looking at water in a bath or at the wind makes it clear that even 
the simple fluids that Euler described can display very complicated motions, and in 
general Euler was able to deal only with special cases, although he did become the 
first person to describe motion in vortices in mathematical terms. Perhaps the best 
way that we can indicate the difficulties in this branch of mathematics is to point 
out that Euler’s equations of motion for a perfect fluid are still far from being adequately
understood, and the problems raised by the equations for a general fluid (the
so-called Navier-Stokes equations) are among the “millennium problems” whose
solutions could earn a mathematician a million-dollar prize from the Clay Mathematics
Institute.⁶

The fact is that Euler was exceedingly prescient when he wrote, at the start of
E226, that⁷:


Having established in my previous Memoir the principles of fluid equilibrium in their most 
general form, regarding both the diverse nature of fluids and the forces that act upon them, 
I now propose to deal with the motion of fluids in the same way and to seek out the general 
principles on which the entire science of fluid motion is based. It will readily be understood 
that this is a much more difficult undertaking and involves studies of incomparably greater 
depth. Nevertheless, I hope to arrive at an equally successful conclusion, so that, if difficulties 
remain, they will pertain not to Mechanics but purely to Analysis, this science not yet having 
been brought to the degree of perfection necessary to develop analytical equations that 
embody the principles of fluid motion. 


⁵ The function S was later called the velocity potential by Helmholtz.

⁶ See Carlson et al. [32].

⁷ See Frisch’s translation in the Euler Archive. He notes that Euler wrote “formules” where he has
supplied “analytical equations”.


4.2.1 Recent Discoveries About the Euler Equations


I cannot resist quoting from an astonishing passage in Cédric Villani’s [261] Birth 
of a Theorem (2016, 91-93): 


Imagine you’re walking through the woods on a peaceful summer’s afternoon. You pause at 
the edge of a pond. Everything is perfectly calm, not the slightest breeze. 


Suddenly the surface of the pond becomes agitated, as though seized by convulsions; a few 
moments later, it is sucked down into a roaring whirlpool. And then, a few moments after 
that, everything is calm once more. Still not a breath of air, not even a ripple on the surface 
from a fish swimming beneath it. So what happened? 


The Scheffer-Shnirelman paradox, surely the most astonishing result in all of fluid mechan- 
ics, proved that such a monstrosity is possible, at least in the mathematical world. 


[...] It rests on the incompressible Euler equations, the oldest of all partial differential 
equations, used by mathematicians and physicists everywhere to describe a perfectly incom- 
pressible fluid without any internal friction. It has been more than two hundred fifty years 
since Euler derived his fundamental equations, and yet not all of their mysteries have been 
penetrated. Indeed, they are still considered to mark out one of the most treacherous regions 
of the mathematical world. When the Clay Mathematics Institute set seven ‘millennium 
problems’ in 2000, offering $1 million apiece for their solution, it did not hesitate to include 
the regularity of solutions to the Navier-Stokes equations. It was very careful, however, to 
avoid any mention of Euler’s equations — a far greater and more terrifying beast. 


And yet at first glance Euler’s equations seem so simple, so innocent, utterly devoid of 
guile or cunning. No need to model variations in density or to grapple with the enigmas of 
viscosity. One has only to write down the classical laws of conservation: conservation of 
mass, quantity of motion, and energy. 


But then . . . suddenly, in 1993, Scheffer showed that Euler’s equations in the plane are 
consistent with the spontaneous creation of energy! Thanks to [several subsequent authors] 
we now realise that even less is known about Euler’s equations than we thought. 


And what we thought we knew wasn’t much to begin with. 


4.3 Euler and the Propagation of Sound 


In the mid-eighteenth century, the nature and the propagation of sound were poorly 
understood. Newton had written about it in the Principia, and Euler in a study of 
heat, but d’Alembert had dismissed both in his Traité des fluides Sect. 219, writing
that 


The formula given without proof by Euler is very different from Newton’s, and I do not know 
how he was led to it; as for Newton’s formula, it is proved in the Principia but in perhaps 
the most obscure and difficult part of that work. 


So it was ambitious of the young Lagrange to write as one of his first works a 
112-page memoir on the subject ([169], from which the above quote was taken).⁸


⁸ This work has an interesting account of the ideas of Euler, d’Alembert, and Daniel Bernoulli on
the nature of solutions to the equation for the vibrating string.

His work in turn provoked Euler to return to the subject, and in his [92], he first
observed that Newton’s account was ingenious reasoning based on purely arbitrary
hypotheses. Then he wrote that⁹


All those who have dealt with this matter after Newton either have fallen into the same 
trap, or, wanting to delve into the true movement of the air, have rushed into intractable 
calculations, from which one could absolutely not draw any conclusions, and I must admit 
that I arrived at one or the other place whenever I undertook this research. I was therefore 
pleasantly surprised when I saw in this excellent book that I have just mentioned, that Mr. De 
La Grange has happily overcome all these difficulties, and that by calculations which could 
seem quite unintelligible. This is unquestionably one of the most important discoveries we 
have made for a long time in Mathematics, and one which may lead us to many others. 

In examining these prodigious calculations, I wondered at first if it would not be possible to 
achieve the same goal by an easier route, and after some effort I got there. I have therefore the 
honor to explain here the method that seems the most suitable for this study, but, as simple 
as it may appear, I must insist that it would not have occurred to me, if I had not seen the 
ingenious analysis of M. De La Grange. 


None of which meant that the derivation was simple (and because it is difficult it
is omitted here). The upshot was a system of three second-order partial differential
equations, not unlike the equation for the vibrating string, that described the infinitesimally
small motions of the air that constitute the passage of a sound wave. This is a longitudinal
wave: the particles of the air move in the direction of the sound (unlike the vibrating
string, which oscillates transversely). Euler obtained these equations in his ([93],
Sect. 43). If the air is homogeneous and the sound travels radially at the same speed
in all directions, then with respect to coordinates centred at the source of the sound
the displacement (x, y, z) of a particle at (X, Y, Z) is given by

\[ x = Xs, \quad y = Ys, \quad z = Zs, \]

where $s$ is a function of the time $t$ and the radial distance $V = \sqrt{X^2 + Y^2 + Z^2}$. In
this case, Euler’s equations reduce to ([93], Sect. 45)


\[ \frac{1}{2gh}\frac{\partial^2 s}{\partial t^2} = \frac{4}{V}\frac{\partial s}{\partial V} + \frac{\partial^2 s}{\partial V^2}, \]

where 2g is the acceleration due to gravity in Euler’s units and h is a measure of the
elasticity of the air.

Mathematically, the constant term 2gh can be absorbed by changing the time 
variable to $\sqrt{2gh}\,t$, at the cost of changing z to Ath This produces an equation of
the form 


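\[ \frac{\partial^2 s}{\partial t^2} = \frac{4}{V}\frac{\partial s}{\partial V} + \frac{\partial^2 s}{\partial V^2}. \]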


A further change of variable will write this as 


⁹ Translation slightly modified from that of Ian Bruce in the Euler Archive.

which is propitious.
This is the form in which Euler studied it in his Institutionum Calculi Integralis 
vol. 3, Part 2, Sect. 4 (E385, 1770, Sect. 322).¹⁰ Euler wrote it as
Oz 
Oxdy 
He regarded x + y as a new variable, and looked for a solution in the form v =
(x + y) F(x). On this assumption, the method of undetermined coefficients allowed 
Euler to calculate F(x) as a power series, and he deduced a recurrence relation: 
n+2m+-—rA=0 
(n+ 2m\+2m+ + d)B+(m+r)A=0 
(n+2m\+4m +27 4+3A+2)C + (m+A+1B=0 


(n+2m+ 6m +7 +5A+6)D4+ (m+rA+2)C =0 


from which he deduced B, C, D, .... If we write $A = b_0$, $B = b_1$, $C = b_2$, ...,
Euler’s conclusion was that 


ory 


= m+A+j-1 
But Euler also noticed that there are values where the series breaks off. These occur
when j is an integer and m + + j = 0, which can occur when ‘ —m—n+m’ is 
a square. 

The same can be done with y instead of x, and so Euler declared that the general 
solution was of the form 


Yee ao he" oO: 
where, for example, $f^{(j)}$ denotes the jth derivative of f, which Euler was confident
was a complete solution because it contained two arbitrary functions, f(x) and g(y). 

Darboux’s clear analysis of Euler’s equation (4.4) and its solutions will be found 
at the end of this chapter. 


¹⁰ A similar equation for the propagation of sound was studied by Laplace, with less success, in a
paper he presented to the Académie des Sciences in 1773 but which was published only in 1777.


4.4 Euler’s Vision of Mechanics


In his two-volume Mechanica [70] Euler discussed the motion of point masses under 
forces and in resisting media (vol. 1), and their motion on surfaces (vol. 2). Signifi- 
cantly, the book is written throughout in the language of the (Leibnizian) calculus. He 
also sketched a plan for describing the motion of solid rigid masses, elastic bodies, 
fluids, and gases. Much of this was unknown territory at the time, and over the
decades Euler’s contributions to various parts of this programme greatly enlarged 
the reach of mathematics. 

Let us observe in passing that in volume 2 of the Mechanica (p. 464, Sect. 832)
Euler mentioned what may be one of the first partial differential equations¹¹:


Finally I have turned or rounded surfaces [of revolution], which are generated by the rotation
of any curve about an axis; if AX were such an axis, on putting x constant, the equation
between y and z gives a circle with centre P. Whereby the equation for these has this form:

\[ dz = P\,dx - \frac{y\,dy}{z} \quad \text{or} \quad z\,dz + y\,dy = zP\,dx, \]


where Pz only depends on x; or 


with X present as a function of x. 


The partial differential equation appears here as a differential of the form Pdx = 
Rdz + Qdy that is to be integrated. 

Euler dealt with rigid bodies in his ‘Découverte d’un nouveau principe de 
Mécanique’ ([79], E177). In it, he gave a decisive reformulation of the theory of 
mechanics that brought it into line with the practice of the calculus as he understood 
it. Truesdell, in his An idiot’s fugitive essays in science ([259], 317) called the paper
“a great masterpiece”, and correctly observed that “it has dominated the mechanics 
of extended bodies ever since”. He went on 


This paper contains the first proposal of the so-called Newton’s equations, f = ma in rect- 
angular Cartesian coordinates, as a “new principle of mechanics”, the common origin of all 
the several other principles then in use. 


Euler’s plan for the paper began with a definition of a solid body as one whose 
parts do not move with respect to each other (unlike, say, a liquid). He then said that 
existing principles of mechanics would show that at any instant the motion of a solid 
body can be analysed in terms of the motion of its centre of gravity and the rotation 
of the solid around an axis through the centre of gravity. This would be done by 
showing how the forces on the body determine how the centre of gravity will move. 
However, new principles would be needed to understand the rotation, which is about 
a varying axis. A start could be made by analysing rotations about a fixed axis, but it
would be necessary to consider axes of rotation that do not pass through the centre
of gravity of the body.

¹¹ Translation by Ian Bruce in the Euler Archive.

The new principle, upon which he proposed to base all of mechanics, should, he 
said, be derived 


from first principles, or rather axioms, on which all the theory of motion is based. The 
axioms relate to infinitely small bodies that can only have a progressive motion; and all 
other principles of motion must be deduced from these, those which serve to determine the 
motion of solids as well as of fluids; all other principles will be nothing but the application 
of these axioms in various ways.¹²


As he remarked, there were several such principles in use, and he proposed to derive 
them all from the new principle that he now put forward. 

He began by considering an infinitely small body of mass M acted upon by some 
forces. Its motion can be described with respect to a fixed but arbitrary plane by
considering the height x of the point mass above this plane. The forces acting on
the point mass in various directions can be expressed in terms of forces parallel to 
the plane and forces perpendicular to it; Euler let P be the force perpendicular to the 
plane. After a time dt, the point mass will be a distance x + dx from the plane, 


and taking the element of time dt as constant, it will be the case that 2M ddx = ±P dt²,
according as the force P tends to move the body away from or towards the plane. It is this
single formula that contains all the principles of mechanics.¹³


Euler employed a system of units in which the quantity M is measured in units such 
that the point mass has a weight of M near the surface of the Earth, and accordingly 
the force P is then the weight of the body. If the body moves away from the plane 
with a speed dx /dt, and if this is the speed that it would acquire by falling through 
a height of h, then one has 


\[ \left(\frac{dx}{dt}\right)^2 = h \quad\text{and so}\quad dt = \frac{dx}{\sqrt{h}}. \]


Euler next supposed that the motion of the point mass was measured with respect 
to three mutually perpendicular planes, and supposed that the forces acting perpen- 
dicularly to these planes were P, Q, and R, respectively. He then wrote down these 
equations of motion: 


\[ 2M\,ddx = P\,dt^2; \quad 2M\,ddy = Q\,dt^2; \quad 2M\,ddz = R\,dt^2. \]

This is the first time Newton’s equations of motion were expressed in the formalism
of the calculus. Moreover, they have been expressed with respect to three mutually
perpendicular but otherwise arbitrary axes; taking perpendicular axes as standard
came in with Euler, not, for example, Descartes. There is another important difference
between Euler’s formulation and Newton’s: Newton spoke of bodies, Euler of
infinitesimal elements out of which bodies are formed.

¹² See Euler [79], 194.

¹³ See Euler [79], 195. Note that Euler’s conventions about units produce factors of 2 in formulas
where our conventions do not.

As Euler then noted, if no forces act then P = 0, Q = 0, R = 0 and so the above 
differential equations can be integrated and the point mass is shown to move in a 
straight line. This establishes that a body initially at rest remains at rest, and one 
initially in motion remains moving uniformly in the same direction unless it is acted 
upon by a force.¹⁴

Next, Euler analysed the motion of a body whose centre of gravity is fixed, and 
by a more complicated argument of the same kind as before he deduced that at any 
instant the body is rotating about an axis through the centre of gravity. 

He then set about describing the motion in general. He supposed that there are 
three mutually perpendicular axes in the body (OA, OB, and OC) that meet at the 
centre of gravity O. (Unlike the earlier choice of coordinates, which were taken 
with respect to three fixed planes in space, these axes are fixed in the body and so 
are moving.) To deal with the fact that the axis about which the body is rotating 
itself changes with time, Euler showed that it is enough to know how the three axes 
OA, OB, and OC change with time, which depends on the shape of the body and 
the distribution of mass within it. However, although Euler was able to obtain the 
differential equations of motion, even he found them “too long”, and he concluded
this paper with a discussion of some special cases. 

More than 10 years later, however, Euler returned to this question and showed 
in his book the Theoria motus corporum solidorum seu rigidorum (Theory of the 
motion of solid or rigid bodies) (E289) of 1765 that every rigid body has a set of 
axes with respect to which its behaviour is particularly simple.¹⁵

First, he gave a new account of how a body rotates about a fixed but arbitrary axis 
through its centre of gravity in terms of what are called today the “Euler angles” of
a rotation. Then he introduced the concept of the principal axes of rotation of the
body. This is the crucial breakthrough that extended Newtonian mechanics from the
study of point masses to arbitrary bodies: everything from the bones in our bodies
to car chassis and orbiting satellites.


Euler began by explaining how to describe and quantify the motion of a rotating
body. This led him to define 


Sect. 422. The moment of inertia of a body with respect to some axis is the sum of all the 
products which arise, if the individual elements of the body are multiplied by the square of 
their distances from the axis. 


The moment of inertia is an integral of terms dM and r² that are always positive, and
so it is necessarily positive. Moreover, it can be calculated with respect to any axis, 
not merely an axis about which one might suppose that the body is ‘really’ rotating. 
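
Euler’s definition in Sect. 422 becomes a finite sum when the body is idealised as a few point masses. The following Python sketch is a modern illustration under that assumption, not Euler’s own procedure; the function name `moment_of_inertia` and the sample masses and positions are invented for the example.

```python
import math

def moment_of_inertia(masses, points, axis_point, axis_dir):
    """Sum of m * r**2, where r is the distance of each mass from the axis.

    The axis passes through axis_point in the direction axis_dir.
    """
    norm = math.sqrt(sum(c * c for c in axis_dir))
    u = [c / norm for c in axis_dir]          # unit vector along the axis
    total = 0.0
    for m, p in zip(masses, points):
        d = [p[i] - axis_point[i] for i in range(3)]
        along = sum(d[i] * u[i] for i in range(3))           # component along the axis
        r2 = sum(d[i] * d[i] for i in range(3)) - along * along  # squared distance from the axis
        total += m * r2
    return total

# Hypothetical body: three point masses in the (x, y)-plane.
masses = [1.0, 1.0, 2.0]
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
origin = (0.0, 0.0, 0.0)

I_x = moment_of_inertia(masses, points, origin, (1, 0, 0))
I_y = moment_of_inertia(masses, points, origin, (0, 1, 0))
I_z = moment_of_inertia(masses, points, origin, (0, 0, 1))
print(I_x, I_y, I_z)
```

Every term in the sum is a mass times a squared distance, so the result cannot be negative, and the axis can be chosen freely, which is the point made in the text.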


¹⁴ Euler here seems to have shared a naive belief that rest and motion of any kind are somehow
different, but the separation of rest from uniform motion here may have been a pedagogical position.

¹⁵ Our account concentrates on Vol. 1, Chap. 5. There is an English translation of much of the book
by Ian Bruce in the Euler Archive.

To calculate the motion of a body from the forces acting upon it, Euler proposed
that one look for the most appropriate set of axes to use for a given body. First,
he said that one should take the axis IG through the centre of inertia that yields
a minimum for the moment of inertia among all such axes. Then one should look for
orthogonal axes through the centre of inertia, I, and specifically for axes about which
the moment of inertia is a minimum or a maximum. This is a calculus problem that
leads to a cubic equation, which must have either one real root or three. However,
Euler was unable to deduce from the equation itself that there are always three real
roots, and gave only an obscure argument to support the claim that there are.
Finally, Euler proclaimed that every rigid body has three axes mutually at
right angles with respect to which the moments of inertia are either a maximum or a
minimum. These he called the principal axes:


446. The principal axes of any body are these three axes passing through the centre of 
inertia of this body, with respect to which the moments of inertia are either a maximum or a 
minimum. 
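
In modern linear-algebra terms, which were not available to Euler, the existence of three real roots can be read off at once. The following restatement is offered only as a gloss on the result; the symbols J (the inertia matrix), n (a unit vector along the axis), and μ (a Lagrange multiplier) are assumptions introduced here and are not Euler’s notation.

```latex
% Moment of inertia about the axis through the centre of inertia with unit direction n:
\[ I(\mathbf{n}) = \mathbf{n}^{T} J\,\mathbf{n}, \qquad
   J = \int \bigl( |\mathbf{r}|^{2}\,\mathbf{1} - \mathbf{r}\,\mathbf{r}^{T} \bigr)\, dM . \]
% Extremising I(n) subject to |n| = 1 gives an eigenvalue problem:
\[ J\,\mathbf{n} = \mu\,\mathbf{n}, \qquad \det\,( J - \mu\,\mathbf{1} ) = 0 . \]
% The determinant is a cubic in mu. Because J is real and symmetric, all three
% roots are real, and the corresponding directions n can be chosen mutually orthogonal.
```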


Euler then showed how to analyse the motion of a rotating solid body in terms 
of its motion with respect to the principal axes, how the action of forces affects the 
motion, and how to solve many problems in the dynamics of rigid bodies. 


4.5 Darboux’s Account 


Euler’s study of Eq. (4.4), the equation for the passage of sound, was nicely illustrated 
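by Gaston Darboux over a century later in his ([58], Vol. 2, Chap. 3). He wrote the
equation in the form

\[ \frac{\partial^2 z}{\partial x\,\partial y} - \frac{m}{x-y}\frac{\partial z}{\partial x} + \frac{n}{x-y}\frac{\partial z}{\partial y} - \frac{p}{(x-y)^2}\,z = 0 \]

and noted that the substitution z = θ(x, y)(x − y)^α turns it into one of the same form
for θ and in which

\[ m' = m + \alpha, \quad n' = n + \alpha, \quad p' = p + \alpha^2 + \alpha(m + n - 1). \]

So it is possible to choose a value of α that makes p′ = 0 and write the equation in
the form

\[ \frac{\partial^2 z}{\partial x\,\partial y} - \frac{\beta'}{x-y}\frac{\partial z}{\partial x} + \frac{\beta}{x-y}\frac{\partial z}{\partial y} = 0. \tag{4.6} \]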


Routine calculations show that if Z(β, β′) = Z(β, β′)(x, y) is a solution, then so
is (x − y)^{1−β−β′} Z(1 − β′, 1 − β).
If we set t = y/x, and z = x^λ φ(t), then z is a solution of Eq. (4.6) if and only if φ(t)
satisfies the ordinary differential equation

\[ t(1-t)\varphi''(t) + \bigl(1 - \lambda - \beta - (1 - \lambda + \beta')\,t\bigr)\varphi'(t) + \lambda\beta'\varphi(t) = 0, \]

which is a differential equation that later became known as the hypergeometric
equation and is arguably the most important ordinary differential equation in the
history of mathematics.¹⁶
The canonical form of the hypergeometric equation is the equation

\[ z(1-z)\frac{d^2 w}{dz^2} + \bigl(\gamma - (\alpha + \beta + 1)z\bigr)\frac{dw}{dz} - \alpha\beta w = 0. \tag{4.7} \]


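One solution of this equation is given by the hypergeometric series

\[ F(\alpha, \beta, \gamma, z) = 1 + \frac{\alpha\beta}{1\cdot\gamma}\,z + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\,z^2 + \cdots \tag{4.8} \]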


This series converges in the disc |z| < 1. Euler considered only the case in which the 
variable is real, and gave two accounts of the equation and the series respectively, one 
in four chapters of the Institutionum Calculi Integralis ([95], Vol. 2, Part I, Chaps.
8-11), and a later one presented to the St. Petersburg Academy of Science in 1778 
and published posthumously as Euler [98]. 
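
For a real variable, as in Euler’s treatment, the partial sums of the hypergeometric series are easy to compute term by term. The sketch below is only an illustration; the function name `hyp_series` and the choice of 200 terms are assumptions, and a, b, c stand for the three parameters of the series. It compares the sums with two standard closed forms: F(1, 1, 2, z) = −ln(1 − z)/z and F(−2, 1, 1, z) = (1 − z)².

```python
import math

def hyp_series(a, b, c, z, terms=200):
    """Partial sum of 1 + (a*b)/(1*c) z + a(a+1)b(b+1)/(1*2*c(c+1)) z^2 + ..."""
    total = 0.0
    term = 1.0                     # the n = 0 term
    for n in range(terms):
        total += term
        # ratio of the (n+1)-th term to the n-th term
        term *= (a + n) * (b + n) / ((1 + n) * (c + n)) * z
    return total

z = 0.5
log_case = hyp_series(1, 1, 2, z)       # should approximate -ln(1 - z)/z
poly_case = hyp_series(-2, 1, 1, 0.3)   # series terminates: (1 - 0.3)**2
print(log_case, -math.log(1 - z) / z)
print(poly_case, (1 - 0.3) ** 2)
```

When a is a negative integer (or zero) the factor (a + n) eventually vanishes, so every later term is zero and the series is a polynomial.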

It is easy, if unilluminating, to solve Euler’s hypergeometric equation by the
method of undetermined coefficients. Let us denote one solution of it by
F(−λ, β′, 1 − λ − β, y/x) and another by

\[ (y/x)^{\beta+\lambda} F(\beta, \beta + \beta' + \lambda, 1 + \beta + \lambda, y/x); \]

then the corresponding solutions of Eq. (4.6) are

\[ z = x^{\lambda} F(-\lambda, \beta', 1 - \lambda - \beta, y/x), \]
\[ z = x^{-\beta} y^{\beta+\lambda} F(\beta, \beta + \beta' + \lambda, 1 + \beta + \lambda, y/x). \]

As Euler had already noticed, special cases arise when λ is a positive integer.
There are also solutions obtained by the method of separation of variables, z =
X(x)Y(y). This leads to the equation

\[ x + \frac{\beta X}{X'} = y + \frac{\beta' Y}{Y'}, \]

so both sides must be a constant, a say, and so

\[ X(x) = (x - a)^{-\beta}, \quad Y(y) = (y - a)^{-\beta'}, \]

and so

\[ z = (x - a)^{-\beta}(y - a)^{-\beta'}. \]

¹⁶ See Chaps. 11 and 16.


Finally, as Darboux remarked, it was shown by Paul Appell in 1882 that if φ(x, y)
is an arbitrary solution of Eq. (4.6) then the most general solution is of the form

\[ (cx + d)^{-\beta}(cy + d)^{-\beta'}\,\varphi\!\left(\frac{ax + b}{cx + d}, \frac{ay + b}{cy + d}\right), \]

where a, b, c, d are arbitrary constants and ad − bc ≠ 0.


4.6 Exercises 


1. The hypergeometric equation is the ordinary differential equation

\[ x(1-x)\frac{d^2 w}{dx^2} + \bigl(\gamma - (\alpha + \beta + 1)x\bigr)\frac{dw}{dx} - \alpha\beta w = 0. \]

Show that the series

\[ F(\alpha, \beta, \gamma, x) = 1 + \frac{\alpha\beta}{1\cdot\gamma}\,x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\,x^2 + \cdots \]

is a solution of the equation, and find another, linearly independent solution.

2. The series reduces to a polynomial if either α − 1 or β − 1 is a negative integer, and
is not defined at all if γ is a negative integer or zero (this case Euler excluded).
Show that in all other cases the series is convergent for x = a + bi provided that
a² + b² < 1.


Questions 


1. In what ways did Euler’s approach to mechanics improve upon Newton’s? 

2. Find out what you can about the motion of the Moon. Would it be easy or difficult 
to use knowledge of the motion of the Moon to determine longitude at sea? A 
fascinating take on this is provided by the story of Harrison’s chronometers; see 
what you can find out about them. 


Chapter 5
The Early Theory of Partial Differential
Equations


5.1 Introduction 


As methods for dealing with two or more independent variables advanced it became 
possible to pose questions about partial differential equations. First, as with ordinary 
differential equations, mathematicians looked for formal methods that would lead to 
exact, general solutions, and this led to questions about the existence of integrating 
factors. Euler discussed the problem of finding integrating factors for expressions of 
the form dx + a(x, y)dy in volume 1 of his Institutionum Calculi Integralis [94], 
and considered many types of cases without being able to show that one always 
existed.¹ However, as we shall see, a solution to this problem had been published
a little earlier by d’Alembert. The method of characteristics was introduced for the
first time by Euler and d’Alembert, and extended by Lagrange and Gaspard Monge,
who applied it to first- and second-order partial differential equations.


5.2 Euler’s General Theory of Partial Differential 
Equations 


Euler began to think of creating a theory of partial differential equations in the early 
1760s and outlined how he proposed to start in his [88, 89], where he looked at what 
later generations would call linear and quasi-linear first-order partial differential 
equations in two variables. For reasons of brevity, we shall pass to his later, more 
general account where he repeated most of this analysis and also discussed second- 
order linear partial differential equations. This is Institutionum Calculi Integralis 
Volume 3 of 1770. 


¹ This volume was published in 1768. It and the next volume deal with ordinary differential equations,
the third volume with partial differential equations. There is a useful English translation of all three
books by Ian Bruce, available at the Euler Archive.

He dealt with first-order equations of various kinds in Sect. 1 of Part 1 of the book
by working through a long series of examples of steadily increasing complexity. He
would present what we might call a theorem as a problem, show how to tackle it
in general, comment on the solution, perhaps by obtaining it another way, and then
work through some examples. As so often with Euler, it is not clear whether he was
doing this solely on pedagogical grounds or if he was writing up a version of his own
original route through the topic.²

In Chap. 2, Euler began with the simplest case of a first-order partial differential
equation:

\[ \frac{\partial z}{\partial x} = a, \]

where a is a constant, and the solution is z = ax + f(y). He carefully pointed out
that the arbitrary constant of integration that arises is an arbitrary function of y. Not 
only is it not necessarily given by an equation but also its graph may be a curve that is 
drawn by a free motion of the hand, or indeed by several such curves not connected in 
any way. This insistence on the extreme arbitrariness of the curve was original with 
Euler and not shared by all his contemporaries, and as we shall see, he specifically 
noted that the initial position of a vibrating string may be given by such a function. 

In this chapter, Euler solved first-order partial differential equations of various
forms. He wrote

\[ p = \frac{\partial z}{\partial x}, \quad q = \frac{\partial z}{\partial y}, \]

and set himself as Problem 21 the equation

\[ px + qy = 0. \]

He wrote dz = pdx + qdy, eliminated q, and deduced that

\[ dz = p\bigl(dx - (x/y)\,dy\bigr) = py\,d\!\left(\frac{x}{y}\right). \]

Both sides are therefore differentials of a function, and so he deduced that py must
be a function of x/y, say

\[ py = f'\!\left(\frac{x}{y}\right), \]

where f′ is the derivative of an arbitrary function. Therefore,

\[ dz = f'\!\left(\frac{x}{y}\right) d\!\left(\frac{x}{y}\right), \]

and by integrating both sides, he found that

\[ z = f\!\left(\frac{x}{y}\right). \]

(It would seem that Euler’s idea of arbitrariness has shrunk to implying differentiability.)

² The translation is taken, with slight modification, from Ian Bruce’s version on the Euler Archive.
I have changed Euler’s notation to make it more like ours.
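
The solution of Problem 21 can be checked numerically for any particular differentiable f. The sketch below is a modern sanity check, not part of Euler’s argument; the choice f = sin, the step h, and the names `z` and `residual` are assumptions made for the illustration. It estimates p and q by central differences and evaluates px + qy.

```python
import math

def z(x, y, f=math.sin):
    """Euler's solution of Problem 21: an arbitrary function of x/y."""
    return f(x / y)

def residual(x, y, h=1e-6):
    """Approximate p*x + q*y, with p = dz/dx and q = dz/dy by central differences."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return p * x + q * y

sample_points = [(1.3, 0.7), (-0.4, 2.0), (2.5, -1.1)]
residuals = [residual(x, y) for x, y in sample_points]
print(residuals)   # each entry should be close to zero
```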

In this chapter, Problem 22, Euler got close to the general first-order linear partial 
differential equation (to use a more modern term). He wrote 


a function z(x, y) is sought for which, on writing p = z_x and q = z_y one has q = pV, where
V = V(x, y).


As before, he worked with differentials—these are differential equations, after all. 
He observed that the given equation and the identity dz = pdx + qdy imply that 


dz = p(dx + Vdy). 


He then remarked 


Now a multiplier M will be given, likewise a function of x and y, so that M(dx + Vdy)
is made integrable. Therefore there is put M(dx + Vdy) = dS, and also S will be given a
function of the same x and y. Hence since there shall be dz = p dS/M it is evident that the
quantity p/M must be equal to a function of S, whereby if we put p/M = f′(S), there becomes
z = f(S) and thereupon will be

\[ p = M f'(S) \quad\text{and}\quad q = M V f'(S). \]

We shall turn to the multiplier shortly, but first, let us unpack his solution. We
can interpret the solution this way: if there is a function M = M(x, y) such that
M(dx + Vdy) is integrable, and say the integral is S(x, y) so

\[ M(dx + V\,dy) = dS, \]

then the solution to the partial differential equation is an arbitrary function of S. This
is because

\[ dz = p(dx + V\,dy) = \frac{p}{M}\,dS = f'(S)\,dS, \]

so, integrating f′(S),

\[ z = \int \frac{p}{M}\,dS + G(x), \]

where G(x) is an arbitrary function of x.


Euler’s comments are interesting because they are circumspect. He went on 


Corollary 5.1 Therefore in this case the function sought z is found at once expressed 
in terms of x and y, because S is given by x and y. But it can come about the S gives 
rise to a transcending function, so that moreover by the methods so far known the
multiplier M indeed cannot be found.


Euler’s remark that “the multiplier M indeed cannot be found” may well mean that
the multiplier might simply not exist. In his ([88], Sect. 20), he had said that this was
too much to hope for, so his later opinion depends on what he made of d’Alembert’s
discussion, published in d’Alembert [56], which we shall look at shortly. Perhaps he
meant, in agreement with d’Alembert, that the multiplier cannot be found explicitly.
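
A case in which the multiplier can be written down explicitly may help to fix ideas. The following worked instance of Problem 22 is an illustration chosen here, not one taken from Euler; the particular choice V = y/x is an assumption.

```latex
% Problem 22 with V = y/x, i.e. the equation q = (y/x) p.
\[ dx + V\,dy = dx + \frac{y}{x}\,dy, \qquad M = x
   \;\Longrightarrow\; M(dx + V\,dy) = x\,dx + y\,dy = dS, \quad S = \tfrac12 (x^2 + y^2). \]
% The solution is an arbitrary function of S:
\[ z = f(S), \qquad p = x f'(S) = M f'(S), \qquad q = y f'(S) = M V f'(S), \]
% and indeed q = (y/x) p = pV, in agreement with Euler's formulas for p and q.
```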

With Problem 23 Euler reached what we would call the general first-order linear
partial differential equation in two variables. He wrote the partial differential
equations he was interested in in the form

\[ \frac{\partial z}{\partial y} = V\frac{\partial z}{\partial x} + U, \tag{5.1} \]


where U and V are functions of x and y (see Sect. 1, Chap. 5, Problem 23, Sect. 
146). He then solved this equation by again recalling that dz = pdx + qdy, passing 
to the differential equation 


dz = p(dx + Vdy) + Udy, 


and then looking for an integrating factor M = M(x, y) such that M(dx + Vdy) is
exact, say

\[ M(dx + V\,dy) = dS. \]

He argued that when M is found then

\[ dz = \frac{p\,dS}{M} + U\,dy, \]

and in this equation z, and therefore U and M, can be treated as functions of y and
S. This equation may be integrated if S is held constant, to yield

\[ z = \int U\,dy = T + f(S), \]

where T is a function of y and S.³ So the solution is given as a function of y and
S. But still, Euler did not discuss how M, the integrating factor, or multiplier as
he called it, can be found, except to note that it is sufficient that a solution of the
equation dx + Vdy = 0 can be found, and he was unable to give a general account
of this aspect of the problem. As he put it:


COROLLARY 2 Sect. 148. To this end it is convenient to consider the differential equation
dx + Vdy = 0; for if this can be integrated, likewise thereupon it will be possible to deduce
the multiplier M, so that the formula M(dx + Vdy) truly becomes the differential of a
certain function S, which therefore hence may be found.

³ From this it follows that

\[ dT = U\,dy + W\,dS, \]

and so, using the fact that p/M = ∂z/∂S, p/M = W + f′(S).


Here as always Euler was not interested in fitting the general solution to a set of 
initial or boundary conditions. Just as with ordinary differential equations, the full 
solution was a formula with a degree of arbitrariness to it. 


Two questions arise: What did Euler mean by saying that a multiplier might not 
always exist but that it was sufficient that a solution of the ordinary differential 
equation dx + Vdy = 0 can be found, and why did he not solve the ordinary dif- 
ferential equation, or at least say that it can be solved? He had, after all, given his 
solution method for the first-order ordinary differential equation in Volume 1 of the
Institutionum Calculi Integralis. 

To deal with the multiplier issue first, in Problem 22 the partial differential equation 
is 

\[ v_y - V v_x = 0. \]

This is an equation for an unknown function v = v(x, y). What can we say about
the curves given by v(x, y) = v₀, where v₀ is a constant (the level curves of the
function v)? Along these curves, dv = 0. But we always have dv = v_x dx + v_y dy,
so, comparing this equation with the given partial differential equation, we find that

\[ \frac{dx}{dy} = -\frac{v_y}{v_x} = -V, \]

so we have the ordinary differential equation

\[ dx + V\,dy = 0. \]

This has as its solutions the curves v(x, y) = v₀. If we write

\[ M(dx + V\,dy) = dv \]

then we see that

\[ dv = v_x\left(dx + \frac{v_y}{v_x}\,dy\right) = v_x(dx + V\,dy), \]

so the integrating factor is v_x, where v = v(x, y) is the solution of the ordinary
differential equation

\[ dx + V\,dy = 0. \]


Euler could surely have written this down, and it is not clear why he did not. A 
partial answer is that he had been committed to the formalism of (inexact) differentials 
and multipliers for as many as thirty years by now, and saw no reason to change. 
Another is that he was hung up on the fact that the multiplier is not explicit. 
What is all the more curious about his remark about the ordinary differential
equation is that in Volume 1 of the book, published in 1768, Euler had described a
method for an approximative solution to any differential equation of the form

\[ \frac{dy}{dx} = V(x, y). \]

In Sect. 650, Problem 85, he described how, from an initial point (x, y) = (a, b) one
can proceed in small steps of ω in x to a_1, a_2, ..., setting y successively equal to
b_1 = b + V(a, b)ω, then b_2 = b_1 + V(a_1, b_1)ω, etc. The points (a_i, b_i)
lie on a curve that approximates the solution to the differential equation through the
initial point (a, b).
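
The stepping procedure just described is what is now called Euler’s method. A minimal Python sketch follows; the function name `euler_polygon`, the test equation dy/dx = y, and the two step sizes are assumptions chosen for illustration and do not come from Euler’s text.

```python
import math

def euler_polygon(V, a, b, omega, steps):
    """Return the points (a_i, b_i) obtained by stepping omega in x from (a, b).

    Each step sets b_{i+1} = b_i + V(a_i, b_i) * omega and a_{i+1} = a_i + omega.
    """
    points = [(a, b)]
    x, y = a, b
    for _ in range(steps):
        y = y + V(x, y) * omega   # uses the slope at the current point
        x = x + omega
        points.append((x, y))
    return points

# Test equation dy/dx = y with y(0) = 1, whose exact solution is exp(x).
coarse = euler_polygon(lambda x, y: y, 0.0, 1.0, 0.1, 10)
fine = euler_polygon(lambda x, y: y, 0.0, 1.0, 0.001, 1000)

coarse_error = abs(math.e - coarse[-1][1])
fine_error = abs(math.e - fine[-1][1])
print(coarse_error, fine_error)   # the smaller step gives the smaller error at x = 1
```

For this equation the computed value at x = 1 is (1 + ω)^(1/ω), so the dependence of the error on the step size is easy to see.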

He indicated in informal terms that the smaller the steps the more accurate the 
approximation would be, but he recognised that the further the process was continued 
the more the errors would add up and that this process would be particularly liable 
to gross error if V was very large or very small. To investigate the way the error 
can behave, Euler offered an argument (Sects. 656-667) that brought in more and 
more terms for the power series for y. The errors grow fastest when some of these 
terms are large, which can happen when V (x, y) becomes either zero or infinite, and 
Euler finished his account by providing examples to indicate how to work around 
this problem. 

It is far from certain that Euler thought that his remarks about finding an approxi- 
mative solution constitute a proof that ordinary differential equations have a solution. 
He most likely thought that was simply true and never in need of a proof. What he 
offered was what he said it was: a method of finding an approximate solution that 
would work if handled with care. 

It is much more difficult to understand why he did not regard the ordinary differ- 
ential equation as solved and therefore the multiplier found. He might have thought 
that the solution method he had proposed was an infinite sequence of approxima- 
tions, so it did not provide useful answers. But it is hard to believe that he thought it 
subject to any significant restrictions. 


5.2.1 Second-Order Partial Differential Equations 


Euler next turned to the subject of second-order partial differential equations. He 
began his study of the second-order linear partial differential equation with an expla- 
nation of what the first and second partial derivatives are and how they behave under 
changes of variable (see Part 1, Sect. 2, Chap. 1, Sect. 229, Problem 39). He obtained 
these equations, which express the partial derivatives of a function z with respect to 
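new variables u and v that are related to the old variables x and y by expressions of
the form u = u(x, y), etc.⁴

\[ \frac{\partial z}{\partial x} = \frac{\partial v}{\partial x}\frac{\partial z}{\partial v} + \frac{\partial u}{\partial x}\frac{\partial z}{\partial u}, \qquad
   \frac{\partial z}{\partial y} = \frac{\partial v}{\partial y}\frac{\partial z}{\partial v} + \frac{\partial u}{\partial y}\frac{\partial z}{\partial u}, \]

\[ \frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 v}{\partial x^2}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial x^2}\frac{\partial z}{\partial u}
   + \left(\frac{\partial v}{\partial x}\right)^2\frac{\partial^2 z}{\partial v^2}
   + 2\frac{\partial v}{\partial x}\frac{\partial u}{\partial x}\frac{\partial^2 z}{\partial u\,\partial v}
   + \left(\frac{\partial u}{\partial x}\right)^2\frac{\partial^2 z}{\partial u^2}, \]

\[ \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 v}{\partial x\,\partial y}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial x\,\partial y}\frac{\partial z}{\partial u}
   + \frac{\partial v}{\partial x}\frac{\partial v}{\partial y}\frac{\partial^2 z}{\partial v^2}
   + \frac{\partial v}{\partial x}\frac{\partial u}{\partial y}\frac{\partial^2 z}{\partial u\,\partial v}
   + \frac{\partial u}{\partial x}\frac{\partial v}{\partial y}\frac{\partial^2 z}{\partial u\,\partial v}
   + \frac{\partial u}{\partial x}\frac{\partial u}{\partial y}\frac{\partial^2 z}{\partial u^2}, \]

\[ \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 v}{\partial y^2}\frac{\partial z}{\partial v} + \frac{\partial^2 u}{\partial y^2}\frac{\partial z}{\partial u}
   + \left(\frac{\partial v}{\partial y}\right)^2\frac{\partial^2 z}{\partial v^2}
   + 2\frac{\partial v}{\partial y}\frac{\partial u}{\partial y}\frac{\partial^2 z}{\partial u\,\partial v}
   + \left(\frac{\partial u}{\partial y}\right)^2\frac{\partial^2 z}{\partial u^2}. \]

⁴ Euler wrote t where we have written v here and throughout.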


In Chap. 2, he then explained how to deal with equations of the form z_xx = P(x, y)
and showed by integrating twice that the general solution is

\[ z = \int\left(\int P\,dx\right)dx + xF(y) + G(y), \]


where F and G are arbitrary functions. He then discussed what equations can 
be reduced to one of this form by suitable changes of variable, and explored 
how to extend the method of changes of variable to equations of the form z_yx =
P(x, y, z)z_x + Q(x, y, z).
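Section 2, Chap. 3 begins in Sect. 296 with the problem of solving the wave
equation

\[ \frac{\partial^2 z}{\partial y^2} = a^2\frac{\partial^2 z}{\partial x^2}, \]

where a is a constant. Euler reduced it in the way just described, by showing that the
substitutions

\[ t = \alpha x + \beta y, \quad u = \gamma x + \delta y \]

transform the equation to

\[ (\beta^2 - a^2\alpha^2)\frac{\partial^2 z}{\partial t^2} + 2(\beta\delta - a^2\alpha\gamma)\frac{\partial^2 z}{\partial t\,\partial u} + (\delta^2 - a^2\gamma^2)\frac{\partial^2 z}{\partial u^2} = 0. \]

So Euler set

\[ \alpha = 1, \quad \beta = a, \quad \gamma = 1, \quad \text{and } \delta = -a, \]

thus reducing the equation to

\[ \frac{\partial^2 z}{\partial t\,\partial u} = 0, \]

which he had earlier shown is solved by integrating twice and has the complete
solution z = f(t) + F(u), where f and F are arbitrary functions.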
Euler then wrote


COROLLARY 1 Sect. 297. Therefore the value z of this is equal to the sum of two arbitrary 
functions, the one is of x + ay, and other of x — ay, and both these functions thus can be 
assumed at will, so that also discontinuous functions are able to be taken in place of these. 


COROLLARY 2 Sect. 298. Therefore any two curves described freely by the hand as it 
pleases are able to be taken according to this usage. Evidently if in one the abscissa is 
taken as = x + ay, and in the other truly the abscissa = x — ay, then the sum of the applied 
lines [i.e. the y-coordinates. Editor’s note] will always put in place a suitable value for the
function z. 


The first corollary is Euler’s way of saying first that the solution is a sum of an 
arbitrary function of x + ay and an arbitrary function of x — ay. The second one says 
that if the coordinates are changed to u = x + ay and v = x — ay then the solution 
is the sum f(u) + g(v). 
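
One modern way to see why curves with corners cause no difficulty for this form of the solution is the following exact identity, which holds for any functions f and F whatever: if z = f(x + ay) + F(x − ay), then z(x, y + k) + z(x, y − k) = z(x + ak, y) + z(x − ak, y). For smooth f and F, subtracting 2z(x, y) from both sides and dividing by k² recovers the wave equation as k tends to zero. The Python sketch below checks the identity for two kinked profiles; it is an illustration added here, not an argument found in Euler, and the profiles `tent` and `ramp`, the value a = 2, and the grid are all assumptions.

```python
def tent(s):
    """A kinked 'hand-drawn' profile: continuous, but with corners."""
    return max(0.0, 1.0 - abs(s))

def ramp(s):
    """A second kinked profile, with a corner at s = 0.3."""
    return 0.5 * abs(s - 0.3)

def z(x, y, a):
    """Euler's form of the solution: f(x + a*y) + F(x - a*y)."""
    return tent(x + a * y) + ramp(x - a * y)

def defect(x, y, k, a):
    """z(x, y+k) + z(x, y-k) - z(x+a*k, y) - z(x-a*k, y); zero up to rounding."""
    return (z(x, y + k, a) + z(x, y - k, a)) - (z(x + a * k, y, a) + z(x - a * k, y, a))

# Largest defect over a grid of sample points, including points near the corners.
worst = max(
    abs(defect(0.1 * i, 0.07 * j, 0.05, 2.0))
    for i in range(-20, 21)
    for j in range(-10, 11)
)
print(worst)
```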

Euler then compared his solution method with d’Alembert’s, noting that he admitted
a much greater range of candidates for the functions f and F than d’Alembert
had done.⁵ Of more interest in the present context is Euler’s Scholium 3 (Sect. 301),
where he remarked that our solution has this disadvantage because it leads to an
imaginary expression for this equation

\[ \frac{\partial^2 z}{\partial y^2} + a^2\frac{\partial^2 z}{\partial x^2} = 0, \tag{$*$} \]

evidently

\[ z = f(x + ay\sqrt{-1}) + F(x - ay\sqrt{-1}). \]

He then noted that

\[ z = \tfrac{1}{2} f(x + ay\sqrt{-1}) + \tfrac{1}{2} f(x - ay\sqrt{-1})
     + \frac{1}{2\sqrt{-1}} F(x + ay\sqrt{-1}) - \frac{1}{2\sqrt{-1}} F(x - ay\sqrt{-1}) \]


will always be real. However, although Euler was confident that this reduction to real 
values would always be possible for curves that he called analytic (i.e. were given 
as explicit functions of a real variable) or could be represented as series of sines and 
cosines, he explicitly doubted that this would be possible for arbitrary curves, drawn 
as he put it by the free motion of the hand. He therefore concluded that this was “a 
great defect in the calculation, on account of which a great many solutions lose their 
power”. 

This means that although Euler had given a recipe for producing a solution to
the equation (*) by starting with a real expression such as x² or sin x (and obtaining
x² − a²y² or sin x cosh ay) he could not be sure that all solutions were of this kind.⁶
This reflects a deeper lack of knowledge about the passage from complex functions
to harmonic functions that only became available after Riemann.

⁵ D’Alembert’s arbitrary functions were nonetheless regarded by him as analytic because, in his view,
the calculus applied to them. Euler was more and more of the opinion that the initial conditions for
the wave equation could be given by much more general curves.
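
The two examples just mentioned can be reproduced with complex arithmetic. The sketch below is a modern illustration only; the function name `real_combination` and the sample values of x, y, and a are assumptions. It forms ½ f(x + ay√−1) + ½ f(x − ay√−1) for f(w) = w² and f(w) = sin w.

```python
import cmath
import math

def real_combination(f, x, y, a):
    """Return (f(x + i*a*y) + f(x - i*a*y)) / 2 as a complex number."""
    w = complex(x, a * y)
    return 0.5 * (f(w) + f(w.conjugate()))

x, y, a = 1.5, 0.5, 2.0

square_case = real_combination(lambda w: w * w, x, y, a)
sine_case = real_combination(cmath.sin, x, y, a)

print(square_case, x**2 - a**2 * y**2)            # real part is x^2 - a^2 y^2
print(sine_case, math.sin(x) * math.cosh(a * y))  # real part is sin x cosh ay
```

In both cases the imaginary parts cancel, as Euler’s formula requires; the code says nothing, of course, about curves that are not given by such expressions.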

Moreover, the equation (*) is unique among equations of the form 


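\[ A\frac{\partial^2 z}{\partial y^2} + 2B\frac{\partial^2 z}{\partial x\,\partial y} + C\frac{\partial^2 z}{\partial x^2} + R\frac{\partial z}{\partial y} + S\frac{\partial z}{\partial x} + Tz + V = 0, \]

where A, B, and C are functions of x and y and B² − AC < 0, in that it has a “twin”
with B² − AC > 0, and so Euler had no way in to analyse any of the others.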

He therefore confined his attention to partial differential equations that do not 
raise this problem, and Chap. 3 steadily establishes that all second-order linear partial 
differential equations of the form 


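\[ \frac{\partial^2 z}{\partial y^2} - 2P\frac{\partial^2 z}{\partial x\,\partial y} + (P^2 - Q^2)\frac{\partial^2 z}{\partial x^2} + R\frac{\partial z}{\partial y} + S\frac{\partial z}{\partial x} + Tz + V = 0, \]

where the coefficients are functions of x and y, can be reduced to ones of the form

\[ \frac{\partial^2 z}{\partial u\,\partial v} + P_1\frac{\partial z}{\partial v} + Q_1\frac{\partial z}{\partial u} + R_1 z + S_1 = 0. \]

All the necessary formulae for the changes of variable had been set out in Chap. 1.
His conclusion was that the new variables u and v must satisfy the partial differential
equations

\[ \frac{\partial v}{\partial y} = (P + Q)\frac{\partial v}{\partial x} \quad\text{and}\quad \frac{\partial u}{\partial y} = (P - Q)\frac{\partial u}{\partial x}. \]

These are of the form that Euler had already shown how to solve.⁷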
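
The reason these two first-order equations appear can be seen from the change-of-variable formulae of Chap. 1. The short computation below is a modern summary, assuming the second-order part of the equation is z_yy − 2P z_xy + (P² − Q²) z_xx as above; it is not a transcription of Euler’s working.

```latex
% Coefficient of z_vv after the change of variables (x, y) -> (u, v):
\[ \left(\frac{\partial v}{\partial y}\right)^{2}
   - 2P\,\frac{\partial v}{\partial x}\frac{\partial v}{\partial y}
   + (P^{2} - Q^{2})\left(\frac{\partial v}{\partial x}\right)^{2}
   = \left(\frac{\partial v}{\partial y} - (P + Q)\frac{\partial v}{\partial x}\right)
     \left(\frac{\partial v}{\partial y} - (P - Q)\frac{\partial v}{\partial x}\right). \]
% The coefficient of z_uu has the same form with u in place of v. Choosing
% v_y = (P + Q) v_x and u_y = (P - Q) u_x therefore removes both terms and leaves
% only the mixed derivative z_uv. For the wave equation, P = 0 and Q = a,
% which gives v = x + ay and u = x - ay.
```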

I add a few remarks about the generality of the solution to the wave equation.⁸
D’Alembert always maintained that for a function to be a solution of the equation it
had first to be a candidate, and for that reason analytic; he tried to reject the idea that
there might be classes of functions to which the calculus did not automatically apply.
Challenged by Euler, he gave a geometric argument that rested on the idea that the
curvature of the string must vary continuously, which a modern reader could interpret
as saying a solution must be twice continuously differentiable as a function of each
variable. But Euler was willing to contemplate more general solutions, and so he
gave arguments to show that at points where the solution curve is not differentiable
there are other solutions that are infinitely close, and so any errors that are introduced
at such points will be negligible.

⁶ He did not say this explicitly but left the result to be deduced from his formulae.

⁷ Laplace did the same thing, perhaps independently, in his treatment of partial differential equations
in his [174].

⁸ See Lützen’s [191], 15-23, which also considers Lagrange’s ideas in this regard.

What about the intermediate case 


Because only one second-order term appears this is of the form that Euler had treated 
in Chap. 2. But for this specific equation, his only comments (Sect. 265) were that 
his methods do not apply and it is “allowed to be understood that the resolution of 
this has to be thought out with the greatest hardship”. 


5.3 The Introduction of Characteristics by d’Alembert


D’Alembert was already interested in these questions, and he had a fruitful insight into
the question of the existence of integrating factors after he met Euler in Berlin in
1762, an insight that formed a response to Euler’s papers of 1763 and 1764. This is his
theory of characteristics.⁹

In the fourth volume of his Opuscules, d’Alembert argued that given the equation
dx + a dy = 0, where a = a(x, y), it is possible at each point on the y-axis to draw
through it a curve along which dx + a dy = 0.¹⁰ In this way, one obtains a family of
curves, one through each point of the y-axis, along which dx + a dy = 0. Then, if
M is the factor that makes the differential M dx + Ma dy exact, the integral of this
exact differential will be constant along these curves and only vary from curve to
curve. So, it is enough to prescribe the values of M arbitrarily along the y-axis for
the function M to be known everywhere in the plane. This is the basic method of
characteristics: the curves d’Alembert defined are known as the characteristic curves
and the method is valid at least locally and for as long as the curves are not tangent
to the y-axis.

D’Alembert described the values of M at each point as a height above the
(x, y)-plane, so the curves of constant M are curves of constant height, which we might
call level curves or contour lines.

He also explained that if M is an integrating factor for a differential dx + a dy, so
M(dx + a dy) = du, say, then so is M times any function of u, because
f(u)M(dx + a dy) = f(u)du.
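
A small worked example may make d’Alembert’s picture concrete. The choice a(x, y) = 1 + x² is mine, made only because the integrals are elementary; it is not taken from d’Alembert’s text, and the sketch is offered under that assumption.

```latex
% Hypothetical example: a(x, y) = 1 + x^2 (not d'Alembert's own).
\begin{align*}
  &dx + (1 + x^2)\,dy = 0
     && \text{the given equation}\\
  &M = \frac{1}{1 + x^2}
     && \text{one integrating factor}\\
  &M\bigl(dx + (1 + x^2)\,dy\bigr) = \frac{dx}{1 + x^2} + dy = du,
     \qquad u = \arctan x + y
     && \text{the exact differential}\\
  &y = c - \arctan x
     && \text{the curve } u = c \text{ through } (0, c)\\
  &f(u)\,M\bigl(dx + (1 + x^2)\,dy\bigr) = f(u)\,du
     && \text{so } f(u)M \text{ is again an integrating factor}
\end{align*}
```

Here u is constant along each curve, and each curve is labelled by the point (0, c) at which it crosses the y-axis. On the y-axis the factor f(u)M reduces to f(y), so prescribing the values of the integrating factor there fixes f, and with it the factor f(arctan x + y)/(1 + x²) at every point of the plane.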

He did not connect this insight to the method of changing variables when solving
a first-order partial differential equation. Had he done so, he could have said that
replacing x by the new variable u reduces the original partial differential equation
to one of the form $z_u = 0$, whose solutions are given by taking z to be an arbitrary
function of the other variable y, so the solution is known everywhere once it is known
on the y-axis. The corresponding differential is dy which is exact, as is any differential
f(y)dy. This suggests that even at this stage the notion of boundary values for a partial
differential equation was not clear or established.

⁹For a revision of the mathematics, see Appendix B.
¹⁰See d’Alembert [56], pp. 225-281, and especially 255-258.

D’Alembert was clear that the sought-for function M would be far from unique,
and he commented that “it can often happen that M will not be expressible
algebraically although it can always be determined geometrically”. This was his reply to
Euler, who in his ([88], Sect. 20) had said that this was too much to hope for. But
d’Alembert was unhappy that his method was far from being explicit, and so he set
out a tentative method for finding an expression for M “when this can be done”.
This method was little more than the hope that there will be a change of variables,
obtained by regarding x and y as functions of u and z, that will make the transformed
equation tractable because it is now written in variables that separate.

What became known as the characteristic curves are the solutions of the
differential equation dx + a dy = 0, and the value of z(x, y) along each such curve is
determined by the value at an arbitrary given point on it. So, to solve the partial
differential equation one finds a curve γ that is not a characteristic curve but crosses
all the characteristic curves, and assigns a value to the function z at each point on
γ, and then solves the ordinary differential equation (5.7) with those initial values.
Under the heading of the method of characteristics, this became a standard technique
in the theory of linear and quasi-linear partial differential equations.


5.4 Laplace 


In his memoir (1777), Pierre Simon Laplace, a protégé of d’Alembert’s, presented
what he claimed was the first systematic treatment of linear partial differential
equations, one that went beyond the isolated cases treated by d’Alembert and others.¹¹ He
praised d’Alembert as the inventor of the calculus of partial differential equations,
and mentioned neither Euler nor Lagrange by name, perhaps because d’Alembert
had been helpful to him early in the younger man’s career, by securing him a
professorship in mathematics at the École Militaire in 1769, the year he turned 20, and
because d’Alembert and Euler had been rivals until the 1760s.

Laplace began with the first-order equation, which we shall write in a form
equivalent to his as

$$\alpha\frac{\partial z}{\partial x} + \beta\frac{\partial z}{\partial y} = V, \tag{5.2}$$

where α and β are functions of x and y and V is a function of x and y if the equation is
linear. (Laplace allowed V to be a function of x and y with a linear term in z, a slight
generality that we suppress here.) This is the same equation as Euler’s equation (5.1),
and his solution method was the same as Euler’s, except for a little more clarity about
the integrating factor that arises. Like Euler, he said nothing about the solutions being
constant along any characteristic curves.

¹¹On Laplace, see Gillispie [120].

Then, in his ([174], 21-41), Laplace turned to the second-order linear partial
differential equation, which he wrote in the form

$$z_{xx} + \alpha z_{xy} + \beta z_{yy} + \gamma z_x + \delta z_y + \lambda z + T = 0, \tag{5.3}$$

where α, β, γ, δ, λ, and T are functions of x and y. He looked for a change of
variables u = u(x, y), v = v(x, y) that would put the equation in the form

$$z_{uv} + \text{lower terms} = 0.$$

From the values of the various partial derivatives of z with respect to the new
variables, he deduced that the coefficients of $z_{uu}$ and $z_{vv}$ are, respectively,

$$u_x^2 + \alpha u_x u_y + \beta u_y^2 \quad\text{and}\quad v_x^2 + \alpha v_x v_y + \beta v_y^2,$$

so, for the transformed equation to reduce to the required form, the new variables
must satisfy the differential equations

$$u_x^2 + \alpha u_x u_y + \beta u_y^2 = 0, \qquad v_x^2 + \alpha v_x v_y + \beta v_y^2 = 0. \tag{5.4}$$

He factorised these to get these equations¹²:

$$u_x = u_y\left(-\alpha/2 + \sqrt{(\alpha/2)^2 - \beta}\right), \qquad v_x = v_y\left(-\alpha/2 - \sqrt{(\alpha/2)^2 - \beta}\right), \tag{5.5}$$

which are of the type he had shown how to solve earlier in the memoir, as we
described above. Indeed, to solve them, use the fact that u(x, y) = constant implies that
$u_x\,dx + u_y\,dy = 0$, so in Eqs. (5.4) we may replace $u_x/u_y$ by $-dy/dx$. This leads to
the equation

$$(dy)^2 - \alpha\,dx\,dy + \beta(dx)^2 = 0,$$

which we write as

$$\left(\frac{dy}{dx}\right)^2 - \alpha\frac{dy}{dx} + \beta = 0.$$

This is a quadratic equation whose solutions are

$$\frac{dy}{dx} = \alpha/2 + \sqrt{(\alpha/2)^2 - \beta} \quad\text{and}\quad \frac{dy}{dx} = \alpha/2 - \sqrt{(\alpha/2)^2 - \beta}.$$

The characteristic curves are the solutions of these equations.

Along these curves Eqs. (5.4) hold, so the partial differential equation reduces to

$$z_{uv} = 0,$$

which has as its solutions anything of the form f(u) + g(v). So once a curve
is found that crosses these characteristic curves, the values of the solution are found,
just as in the simple case of the wave equation, by saying that f is constant along one
set of curves and g is constant along the other.

¹²This corrects a trivial error in Laplace’s paper: he wrote α for α/2 in the square root.
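
To see the reduction at work in the simplest case, here is a sketch for the wave equation written in the form of Eq. (5.3). The labelling of the variables and the constant c > 0 are my assumptions, chosen for illustration; this is not a calculation taken from Laplace’s memoir.

```latex
% Assumed example: z_{xx} - c^2 z_{yy} = 0, that is, Eq. (5.3) with
% \alpha = 0, \beta = -c^2, \gamma = \delta = \lambda = T = 0, c > 0 constant.
\begin{align*}
  \left(\frac{dy}{dx}\right)^2 - c^2 &= 0
    && \text{the quadratic for the characteristic directions}\\
  \frac{dy}{dx} &= \pm c
    && \text{so } y - cx \text{ or } y + cx \text{ is constant along a characteristic}\\
  u = y + cx, \quad v &= y - cx
    && \text{these satisfy (5.5): } u_x = c\,u_y,\ v_x = -c\,v_y\\
  z_{xx} - c^2 z_{yy} &= -4c^2 z_{uv}
    && \text{by the chain rule}\\
  z &= f(y + cx) + g(y - cx)
    && \text{the general solution of } z_{uv} = 0
\end{align*}
```

The two families of characteristic curves are the straight lines y + cx = constant and y - cx = constant; f is constant along the first family and g along the second.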

Laplace then attempted to solve the original Eq. (5.3) in its reduced form so as
to obtain its general solution, and to supply extra information to provide particular
solutions. He set no store by what we, however, would see as an essential difference
between the cases where $\sqrt{(\alpha/2)^2 - \beta}$ is real and where it is complex. In the real
case, the change of variables produces new real variables and real characteristic
curves, but in the complex case, the new variables are complex and the characteristic
curves are complex. In this latter circumstance, the only hope for Laplace is that
the imaginary parts will somehow vanish at the end. But he made no remark to that
effect, and it would seem that he thought that the reduction is always possible. This
is remarkable, given that Euler had raised an alarm in the simplest case, and because
it breaks an analogy (which perhaps Laplace missed) with the reduction of a curve
given by a quadratic equation in two variables to either an ellipse or a hyperbola
(or a parabola).¹³ No mathematician would have thought that linear transformations
could confuse an ellipse with a hyperbola.

This is further evidence that the understanding of partial differential equations in 
the mid-1770s was often purely formal, and many significant issues remained to be 
discovered. 

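We shall just consider the real case. Now there are new variables u and v that,
respectively, satisfy the equations

$$\frac{\partial u}{\partial x} = \left(-\alpha/2 + \sqrt{(\alpha/2)^2 - \beta}\right)\frac{\partial u}{\partial y} \quad\text{and}\quad \frac{\partial v}{\partial x} = \left(-\alpha/2 - \sqrt{(\alpha/2)^2 - \beta}\right)\frac{\partial v}{\partial y},$$

and with respect to these new variables the partial differential equation takes the
form

$$\frac{\partial^2 z}{\partial u\,\partial v} + a\frac{\partial z}{\partial u} + b\frac{\partial z}{\partial v} + cz = 0, \tag{5.6}$$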


in which a, b, c are functions of u and v. This equation includes, as a particular case,
the equation for the transmission of sound, which Euler was to consider in 1776
(as we saw above). Laplace put forward some ingenious but complicated solution
methods that we shall not be able to pursue.

¹³Strictly speaking, the quadratic must be non-degenerate.


5.4.1 Lagrange’s Method


In his [174], Lagrange took the equation

$$z_x = V z_y + Z,$$

where V is a function of x and y, and Z is a function of x, y, z, and deduced using
the identity $dz = z_x\,dx + z_y\,dy$ that

$$dz = (V\,dx + dy)z_y + Z\,dx.$$

He then went on (Lagrange [174], 83):

Suppose for a moment that

$$V\,dx + dy = 0, \tag{5.7}$$

then I have an equation in two variables, which I integrate, adding an arbitrary constant α.
Now I regard α as a function of x and y determined by this equation; by differentiation I
have

$$V\,dx + dy = A\,d\alpha,$$

A being a function of x, y and α. So, substituting this value in the preceding equation it
becomes

$$dz = A z_y\,d\alpha + Z\,dx.$$

Now, if one replaces y everywhere in this equation by its value in terms of x and α one
thence has an equation in x, z, and α, and supposing α constant one will have the equation
dz = Z dx between the two variables x and z, which can be integrated with the addition of an
arbitrary constant, which will be an arbitrary function of α, thus giving the general solution
of the proposed equation at once, because it only remains to replace α by its value in x and
y.


This is close to being the first statement of the modern approach, that puts the
emphasis on some ordinary differential equations and drops any consideration of
differentials and multipliers.¹⁴
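
Lagrange’s recipe can be followed step by step in a simple case. The choice V = 1 and Z = z below is a hypothetical example of mine, not one that Lagrange gives in the passage quoted, and α is the constant of integration as in the quotation.

```latex
% Assumed example: V = 1, Z = z, so the equation is z_x = z_y + z.
\begin{align*}
  dz &= (dx + dy)\,z_y + z\,dx
    && \text{from } dz = z_x\,dx + z_y\,dy\\
  dx + dy &= 0 \;\Longrightarrow\; x + y = \alpha
    && \text{Eq. (5.7) integrated}\\
  dx + dy &= d\alpha
    && \text{so here } A = 1\\
  dz &= z_y\,d\alpha + z\,dx
    && \text{after substitution}\\
  dz &= z\,dx \;\Longrightarrow\; z = C e^{x}
    && \text{with } \alpha \text{ held constant}\\
  z &= e^{x} f(x + y)
    && \text{taking } C = f(\alpha), \text{ an arbitrary function}
\end{align*}
% Check: z_x = e^x f + e^x f', z_y = e^x f', so z_x = z_y + z.
```

The only integrations needed are of the two ordinary differential equations dx + dy = 0 and dz = z dx, which is the feature of the method that the paragraph above singles out.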


5.5 Exercises 


1. Solve the partial differential equations

$$px + qy = 0, \qquad q = pV, \qquad q = pV + U,$$

and compare your solutions with Euler’s.

¹⁴Lagrange also showed how to vary the argument when V and Z may involve the derivatives of
the unknown function.


Questions 
1. What sorts of solutions did Euler admit to the partial differential equation
$\frac{\partial z}{\partial x} = 0$?
2. What sorts of functions would Euler or d’Alembert have supposed prescribed 
values along a transversal to a family of characteristics? 


Chapter 6
Lagrange’s General Theory of Partial Differential Equations


6.1 Introduction 


The first systematic theories of first- and second-order partial differential equations 
were developed by Lagrange and Monge in the late eighteenth century. Lagrange’s 
approach was not as overwhelmingly algebraic as the bulk of his work might sug- 
gest; in particular, he seems to have introduced the idea of envelopes of curves and 
surfaces into the study of differential equations. Monge’s work is, however, the start 
of the modern geometric theory of partial differential equations, and we shall defer 
consideration of it to Sect. 8.2 below. 


6.2 Clairaut’s Paradox 


In 1734, Clairaut had raised a paradoxical finding that can be described in modern
terms as follows. Let $p = \frac{dy}{dx}$ and consider the differential equation

$$y = xp + f(p).$$

Differentiating this gives

$$p = p + xp' + p'f'(p),$$

or

$$p'(x + f'(p)) = 0.$$

This, combined with the original equation, gives a solution in the parameterised form

$$x = -f'(p), \qquad y = xp + f(p).$$

What struck Clairaut and others as remarkable was that the solution to a differential
equation had been found by further differentiating and not by integrating, as was to
be expected.
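
A specific instance shows both kinds of solution side by side. The choice f(p) = p² is an illustrative assumption of mine and is not an example attributed to Clairaut here.

```latex
% Assumed example: f(p) = p^2 in the equation y = xp + f(p).
\begin{align*}
  y &= xp + p^2, \qquad p'(x + 2p) = 0\\
  p' = 0:&\quad p = c, \quad y = cx + c^2
    && \text{a one-parameter family of straight lines}\\
  x + 2p = 0:&\quad p = -\tfrac{x}{2}, \quad
    y = -\tfrac{x^2}{2} + \tfrac{x^2}{4} = -\tfrac{x^2}{4}
    && \text{the solution found by differentiating}\\
  \text{check:}&\quad \frac{d}{dx}\Bigl(-\tfrac{x^2}{4}\Bigr) = -\tfrac{x}{2} = p
    && \text{so the parabola does satisfy the equation}
\end{align*}
```

The parabola y = -x²/4 is not obtained from y = cx + c² by any choice of the constant c; each line of the family touches it, at the point where x = -2c.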

When Euler wrote about this some twenty years later (in his E236) he opened by 
saying 

I propose here to study a paradox in the integral calculus that can appear very strange:
We sometimes encounter differential equations for which it would seem very difficult to
find the integrals by the rules of integral calculus, and which however are easy to find, not
by means of an integration, but rather by differentiating the proposed equation again; so a
repeated differentiation leads us in these cases to the sought-for integral. It is undoubtedly
a very surprising accident, that differentiation can lead us to the same goal, that we are
accustomed to find by integration, which is an entirely opposite operation.


Euler went on to connect this paradox to another unexpected result: a differential
equation may be solved by an expression that does not arise from the general solution
by a choice of the arbitrary constant it contains.¹ He gave this example,

$$x\,dx + y\,dy = dy\sqrt{x^2 + y^2 - a^2}.$$

It has the circle with equation $x^2 + y^2 - a^2 = 0$, which implies x dx + y dy = 0, as a
solution, and also the general solution, which is a one-parameter family of parabolas

$$\sqrt{x^2 + y^2 - a^2} = y + c$$

or

$$x^2 - a^2 = 2yc + c^2,$$

but the circle cannot be obtained as one of a family of parabolas.

Euler’s explanation of the paradox in this paper did not get close to solving it.
He returned to the question in 1763 in the first volume of his Institutiones Calculi
Integralis (Part 1, Sect. 2, Chap. 4, see Sect. 594 of the English translation), but he
looked for an algebraic criterion to resolve it, and that does not get to the heart of the
matter.

The envelope of a family of curves is a curve that touches each member of the 
family. To put that another way, it is a curve that has, at the point where it meets a 
curve of the family, the same tangent as the curve of the family does. It is a good 
exercise to check that the parabolas in Fig. 6.1 envelope the circle in this sense. 

Let the family of curves be given by the equations f(x, y, a) = 0, where a is a
parameter that varies from curve to curve. Let g(x, y) = 0 be the equation
of the envelope. Differentiating the first equation says that, at points on curves in the
family, we have df = 0, so

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy + f_a(x, y, a)\,da = 0.$$

¹See Capobianco, Enea, and Ferraro [29] for a discussion of the problem this posed for Euler’s ideas
about the foundations of the calculus.

Fig. 6.1 A one-parameter family of parabolas enveloping a circle


If we fix a value of the parameter a then da = 0 and so at points on that specific
curve in the family

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy = 0, \quad\text{so}\quad \frac{dy}{dx} = -\frac{f_x(x, y, a)}{f_y(x, y, a)}.$$

Differentiating the second equation says that, at points on the curve g(x, y) = 0, we
have dg = 0, so

$$g_x(x, y)\,dx + g_y(x, y)\,dy = 0, \quad\text{and}\quad \frac{dy}{dx} = -\frac{g_x(x, y)}{g_y(x, y)}.$$

The slopes of the tangents to the curve with parameter a and to the curve g(x, y) = 0 will
therefore be the same at a point (x, y) where they meet if at that point $f_a(x, y, a) = 0$.
So the envelope is the set of points where both f(x, y, a) = 0 and $f_a(x, y, a) = 0$,
so it is found by eliminating a between those equations.
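
The rule can be checked on Euler’s family of parabolas. In the sketch below the parameter of the family is written c, and a is the fixed radius of the circle, so the roles of the letters differ from those in the general discussion; that relabelling is mine.

```latex
% Family of parabolas with parameter c; a is a fixed constant.
\begin{align*}
  f(x, y, c) &= x^2 - 2cy - a^2 - c^2 = 0
    && \text{the family}\\
  f_c(x, y, c) &= -2y - 2c = 0 \;\Longrightarrow\; c = -y
    && \text{the extra condition}\\
  x^2 + 2y^2 - a^2 - y^2 &= x^2 + y^2 - a^2 = 0
    && \text{eliminating } c \text{ gives the circle}\\
  \frac{dy}{dx}\Big|_{\text{parabola}} &= -\frac{f_x}{f_y} = \frac{x}{c},
  \qquad
  \frac{dy}{dx}\Big|_{\text{circle}} = -\frac{x}{y} = \frac{x}{c}
    && \text{equal slopes where } y = -c
\end{align*}
```

The reason the condition f_c = 0 is the right one is that along the envelope the parameter changes from point to point, so df = 0 there includes the term f_c dc; the slope agrees with that of the member of the family through the same point exactly when that term drops out. The parabola with parameter c touches the circle at the points where y = -c and x² = a² - c², which requires |c| ≤ a.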

One way of looking at envelopes is to imagine the curves f(x, y, a) = 0 forming
a surface in (x, y, a) space. The equation

$$f_x(x, y, a)\,dx + f_y(x, y, a)\,dy + f_a(x, y, a)\,da = 0$$

can be written as

$$\bigl(f_x(x, y, a), f_y(x, y, a), f_a(x, y, a)\bigr)\cdot(dx, dy, da) = 0,$$

which says that $(f_x(x, y, a), f_y(x, y, a), f_a(x, y, a))$ is a normal to this surface. The
further condition that $f_a(x, y, a) = 0$ says that the normals of interest are those that
are parallel to the (x, y)-plane, and this picks out the part of the surface that you see if you
look down the a-axis from a long way away.


6.3 Lagrange 


A decade later, Lagrange (see Fig. 6.2) made a systematic study of first-order partial 
differential equations in his three papers, [173-175]. We shall look at his method for 
solving a general first-order partial differential equation in his (1772); his account of 
complete, general, and particular solutions from his [173] where he solved Clairaut’s 
problem; briefly at his discussion of problems involving more than two independent
variables in his [175]; and finally at his lectures on the subject at the École
Polytechnique in 1806.


6.3.1 Lagrange [173] 


In his [173], Lagrange considered the general first-order non-linear partial differential
equation in two variables x and y for an unknown function u(x, y). This was an
ambitious undertaking, given what little had previously been discovered about partial
differential equations, and he modelled his approach, naturally enough, on what was
known about linear equations and the method of integrating factors.²

A partial differential equation is defined by a function F of the five variables
x, y, u, p, q that satisfies the equation F(x, y, u, p, q) = 0, where, as he wrote,
$p = \frac{\partial u}{\partial x} = u_x$ and $q = \frac{\partial u}{\partial y} = u_y$, and so, when u is known as a function of x and y,
du = p dx + q dy. The crucial difficulty with a non-linear equation is that p and q
no longer occur to the first power but may be squared, multiplied by u, or appear in other
novel combinations.

²See Engelsman [67].

Fig. 6.2 Joseph Louis Lagrange (1736-1813), artist unknown

Lagrange supposed that the partial differential equation can be written in the form 
q = q(x, y, u, p). The problem, as he now saw it, was to find p in terms of u, x, and 
y so that the expression du — pdx — qdy is integrable. Before we see why this is 
true, note that this had been observed by Euler in his ([96], Sect. 128), who remarked 
that the necessary condition is that, on setting 


a 
bee) 2? Ph gs em 
Oz dy ox 


one has 


Lp+Mq+Nz=0, or p 


However, Euler made no remarks on how this equation could be satisfied, and turned 
to other aspects of the theory. 

Lagrange’s (and Euler’s) observation is valid because the existence of an integrat- 
ing factor M such that M(du — pdx — qdy) is exact implies that 


SG (6.1) 


and conversely, a solution p of Eq. (6.1) implies that an integrating factor L for
p dx + q dy can be made to supply a suitable M.

To show this, Lagrange argued along these lines. Fix a value of u, then there is an
integrating factor L such that L(p dx + q dy) = dt, where t is a function of x and y;
L is found by solving the ordinary differential equation p dx + q dy = 0. Now let u
vary. It is clear that $\left(L + \frac{dt}{du}\right)du - dt = 0$ is integrable. He defined $P = L + \frac{dt}{du}$,
and showed that the condition that P as a function of t, y, and u is a function of t and
u only is (6.1). He then let L′ be the integrating factor for P du − dt, and deduced
that L′L(du − p dx − q dy) is exact, and so LL′ = M.
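
For orientation, here is a short modern derivation of the condition that p and q must satisfy, together with the identity behind Lagrange’s definition of P. It is a sketch in my notation, assuming that p and q are regarded as functions of x, y, and u; the result should be equivalent to Eq. (6.1), but I do not claim that it reproduces the form in which Lagrange wrote that equation.

```latex
% Assumption: p = p(x, y, u), q = q(x, y, u), and u(x, y) is twice differentiable.
\begin{align*}
  u_x &= p(x, y, u), \qquad u_y = q(x, y, u),\\
  u_{xy} &= \frac{\partial p}{\partial y} + \frac{\partial p}{\partial u}\,u_y
          = \frac{\partial p}{\partial y} + q\,\frac{\partial p}{\partial u},\\
  u_{yx} &= \frac{\partial q}{\partial x} + \frac{\partial q}{\partial u}\,u_x
          = \frac{\partial q}{\partial x} + p\,\frac{\partial q}{\partial u},\\
  u_{xy} = u_{yx} \;&\Longrightarrow\;
  \frac{\partial p}{\partial y} + q\,\frac{\partial p}{\partial u}
  = \frac{\partial q}{\partial x} + p\,\frac{\partial q}{\partial u}.
\end{align*}
% The identity behind P: with t = t(x, y, u), t_x = Lp and t_y = Lq,
\begin{align*}
  dt &= Lp\,dx + Lq\,dy + \frac{\partial t}{\partial u}\,du,\\
  L(du - p\,dx - q\,dy) &= \Bigl(L + \frac{\partial t}{\partial u}\Bigr)du - dt = P\,du - dt.
\end{align*}
```

The first calculation shows only that the condition is necessary if a function u with the given derivatives is to exist; the converse, that the condition suffices, is the harder half and is what the argument with the factors L and L′ is meant to supply.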

Let p be a solution of (6.1). Lagrange now showed, by an argument I omit, that it 
is enough that p contains an arbitrary constant for a complete solution to the original 
partial differential equation F(x, y, u, p,q) = 0 to be found. 

So in principle, Lagrange had discovered a general method for solving non-linear 
first-order partial differential equations in two independent variables, although he had 
to admit that it could be too complicated to follow even in cases when the solution 
was already known, for example, for the equation 


$$q = pX + Y,$$


where X = X(x, y) and Y = Y(x, y). 

True though it is that the various conditions on L, L′, and M can be translated into
the equations for the characteristics that define the modern solution (see Appendix
C), the summary account of Lagrange’s work by Eduard von Weber in the German
Encyclopedia goes too far in implying that this was known to Lagrange. The use of
characteristics is an important later development.³

Still less was there progress on partial differential equations in more than two
independent variables, despite some inconclusive remarks at the end of the paper.
Here, the problem at the time was that the integrating factor method cannot work:
there is no good theory of an integrating factor for expressions of the form
u dx + v dy + w dz because the analogue of Clairaut’s equations yields an over-determined
system consisting of three equations for the integrating factor.


6.3.2 Lagrange [173] 


We now turn to his account in his [173] of the types of solution a partial differential 
equation may have, and the relationships between what he called complete, general, 
and particular solutions. 

Before looking at it, it will help to look at the account in Courant and Hilbert
([49], Vol. 2, 22-27), which is clearer. Suppose that a first-order partial differential
equation F(x, y, z, p, q) = 0 has a family of solutions z = f(x, y, a) that depend
on a parameter a. If this family has an envelope, then this envelope is also a solution.
Geometrically, this is clear: the envelope shares a tangent plane at the point of contact
with the surface with parameter a, and this plane, therefore, belongs to the family of
tangent planes defined by the partial differential equation.

³See Weber [264], 338, written in 1900.

A complete integral of the given partial differential equation is one that depends on 
two independent parameters, so the solution is of the form z = f(x, y, a, b). Given 
a complete integral, one can impose an entirely arbitrary relation between a and b, 
say b = b(a), and then the one-parameter family of surfaces z = f(x, y, a, b(a)) 
has an envelope which is again a solution. What makes this valuable is that the new 
solution is obtained by differentiation (which is easy) and elimination of a parameter 
(which, however, may not be easy). So a general solution is one that depends on one
parameter, either because a relation such as b = b(a) has been imposed on the parameters
or because it is an envelope.

Finally, singular solutions of the partial differential equation are obtained from 
the two-parameter family of solutions as those envelopes that do not arise from a 
one-parameter family. 

Gaspard Monge gave this example in his Applications d’analyse [202], Chap. 7.
Consider the two-parameter family of surfaces

$$(x - a)^2 + (y - b)^2 + z^2 = 1.$$

They are all spheres of radius 1 with centres in the (x, y)-plane, and it is easy to see
that these surfaces all satisfy the partial differential equation

$$z^2\left(1 + z_x^2 + z_y^2\right) = 1.$$

We shall see that this makes the surfaces what is called a complete integral of the
partial differential equation.

Consider the one-parameter family, where b = b(a); this selects just those
spheres with centres on the curve y = b(x) in the (x, y)-plane. The envelope of
this family is obtained by eliminating the parameter a from the equations

$$(x - a)^2 + (y - b(a))^2 + z^2 = 1, \quad\text{and}\quad x - a + b'(a)(y - b(a)) = 0.$$

It is a tubular surface, called a canal by Monge.

The planes z = 1 and z = −1 are an envelope of the two-parameter family of
spheres, and they satisfy the partial differential equation. They are singular solutions
of the partial differential equation, and they are not a tubular surface.
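
The claims about the spheres can be verified directly. The following check is a routine calculation in modern notation, writing the partial differential equation as z²(1 + z_x² + z_y²) = 1; it is not a transcription of Monge’s own working.

```latex
% Spheres of radius 1 centred at (a, b, 0).
\begin{align*}
  (x - a)^2 + (y - b)^2 + z^2 &= 1\\
  (x - a) + z z_x = 0, \quad (y - b) + z z_y &= 0
    && \text{differentiating in } x \text{ and in } y\\
  z^2 z_x^2 + z^2 z_y^2 + z^2 &= 1
    && \text{substituting for } x - a \text{ and } y - b\\
  z^2\left(1 + z_x^2 + z_y^2\right) &= 1
    && \text{the partial differential equation}\\[4pt]
  -2(x - a) = 0, \quad -2(y - b) &= 0
    && \text{differentiating in } a \text{ and in } b\\
  x = a, \quad y = b \;\Longrightarrow\; z^2 &= 1
    && \text{the envelope } z = \pm 1
\end{align*}
```

On the planes z = ±1 one has z_x = z_y = 0 and z² = 1, so they satisfy the equation even though they contain neither parameter.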

In his [174], Lagrange offered what he believed was the first general analysis 
of the problematic phenomenon first discovered by Clairaut in 1734. He noted that 
Euler had discussed it, as by then had d’Alembert, Condorcet, and Laplace (in a 
paper seen by Lagrange but not yet published). Now Lagrange proposed to offer a 
new and complete analysis. 

He took Euler’s example of the differential equation

$$x\,dx + y\,dy = dy\sqrt{x^2 + y^2 - a^2},$$

which has the solution

$$\sqrt{x^2 + y^2 - a^2} = y + c$$

or

$$x^2 - 2cy - a^2 - c^2 = 0,$$

where c is an arbitrary constant, and worked his way round to explaining that the
circle is the envelope of the family of parabolas.

Lagrange now looked for the a priori reason for the existence of the phenomenon. 
He found that the differential equation specifies a condition on the solutions that is 
also met by the envelope of the complete integral, and he expressed this condition 
first in formal analytic terms and then in geometric terms. He also went on to note 
what happens when there is no envelope or only a one-point envelope; this explains 
other aspects of the original question. 

In Article V of the paper, Lagrange then extended this analysis to partial differen- 
tial equations. A partial differential equation in three variables has two independent 
variables, so a complete integral will have two arbitrary constants, he explained. A 
general integral has one, and a particular integral none. 


6.3.3 Lagrange [175] 


The paper Lagrange [175] deals mostly with particular kinds of partial differential
equations in which some geometric condition is imposed, and in it Lagrange also
generalised some arguments he had used in his [173] to address the question of
solving first-order quasi-linear partial differential equations (to use a more recent
technical term) with any number of independent variables.⁴

He wrote the partial differential equation in a form that differs only notationally
from this:

$$\frac{\partial z}{\partial x} + P_1\frac{\partial z}{\partial x_1} + \cdots + P_n\frac{\partial z}{\partial x_n} = Z,$$

where z is an unknown function of the n + 1 variables $x, x_1, \ldots, x_n$ and $P_1, \ldots, P_n$,
and Z are known functions of $x, x_1, \ldots, x_n$, and z.


His method was to form the n ordinary differential equations 


These can be solved, and the solutions involve n arbitrary constants $\alpha_1, \ldots, \alpha_n$. If
these constants are regarded as functions of x, and φ is an arbitrary function of
$\alpha_2, \ldots, \alpha_n$, then the solution of the partial differential equation is given in the form

$$\alpha_1 = \varphi(\alpha_2, \ldots, \alpha_n).$$

⁴We shall pick up that story when we look at mechanics and Hamilton-Jacobi theory in Chap. 25.


To see what this means, it helps to let n = 2 and to work in $(x, x_1, x_2)$ space,
which is three-dimensional. We now fix values of $\alpha_1$ and $\alpha_2$. This gives us a curve in
$(x, x_1, x_2)$ space as x varies. If we now let $\alpha_2$ be a function of $\alpha_1$ then these curves
form a surface. This surface is a solution of the partial differential equation because
it is composed of curves that must lie in the solution surface because their tangents
satisfy the partial differential equation.
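
A worked instance may help to fix ideas. The equation below and the auxiliary ordinary differential equations are written in what is now the standard form of the method; both are assumptions of mine for illustration, and the constants c₁, c₂, c₃ are my labels, so the counting and indexing need not match Lagrange’s.

```latex
% Assumed example with P_1 = x_1, P_2 = x_2, Z = 1.
\begin{align*}
  \frac{\partial z}{\partial x} + x_1\frac{\partial z}{\partial x_1}
    + x_2\frac{\partial z}{\partial x_2} &= 1
    && \text{the partial differential equation}\\
  \frac{dx_1}{dx} = x_1, \quad \frac{dx_2}{dx} = x_2, \quad \frac{dz}{dx} &= 1
    && \text{the auxiliary ordinary differential equations}\\
  x_1 = c_1 e^{x}, \quad x_2 = c_2 e^{x}, \quad z &= x + c_3
    && \text{their solutions}\\
  c_1 = x_1 e^{-x}, \quad c_2 = x_2 e^{-x}, \quad c_3 &= z - x
    && \text{the constants as functions of the variables}\\
  z &= x + \varphi\bigl(x_1 e^{-x},\, x_2 e^{-x}\bigr)
    && \text{from } c_3 = \varphi(c_1, c_2)
\end{align*}
% Check: z_x = 1 - x_1 e^{-x}\varphi_1 - x_2 e^{-x}\varphi_2,
%        x_1 z_{x_1} = x_1 e^{-x}\varphi_1,  x_2 z_{x_2} = x_2 e^{-x}\varphi_2,
% and the three terms sum to 1.
```

Each choice of the constants gives a curve, and imposing one arbitrary relation among the constants assembles those curves into a solution, which is the geometric picture described above.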

Lagrange did not prove this claim but said that the proof was contained in his 
paper of 1774. His comment on the method is interesting [175], 625: 


By this method one can therefore integrate in general every first-order partial differential 
equation in which the differentials only appear in a linear form, whatever the number of 
variables; at least the integration of these sorts of equations is reduced to that of some 
ordinary differential equations; but one knows that the art of the integral Calculus of partial 
differential equations only consists of reducing this calculus to that of ordinary differential 
equations, and that one regards a partial differential equation as integrated when its integral 
depends only on that of one or more ordinary differential equations. 


6.4 Exercises 


1. A ladder slides down a wall which is at right angles to the ground. What curve 
does it envelope? (Hint: use the angle of the ladder to the vertical as a parameter.) 

2. Verify the claims made in connection with Clairaut’s paradox. Show that if
$f(x, y, a) = x^2 + 2ay - a^2 - b^2 = 0$ then eliminating the parameter a from that
equation and the equation $\frac{\partial}{\partial a} f(x, y, a) = 0$ yields the equation y = a, from which
the result follows.


Questions 


1. How important would you say that one-parameter families of curves were in the 
mathematics of the early eighteenth century? 


Chapter 7
The Calculus of Variations


7.1 Introduction 


Problems in which the solution is a curve (or perhaps a function) with some maximal 
or minimal property began to be studied at the end of the seventeenth century, as we 
saw in Chap. 2. Euler was the first to make a systematic study of problems of this kind,
and his book the Methodus inveniendi (1744, E 65), which is full of different kinds
of examples, stimulated the young Lagrange to invent the methods of the calculus
of variations. These were not the modern methods, but an inspired, and mostly 
unexplained, formalism that was very useful but by no means clear. Out of these 
insights, the so-called Euler-Lagrange equations were discovered, the law of least 
action was proclaimed, and Lagrangian dynamics was created. 


7.2 The Euler-Lagrange Equations Discovered 


There are few topics in the history of mathematics that have a clear beginning. Most 
emerge out of a shifting array of problems, undergo various reformulations, split into 
branches, merge with others, and acquire new aspects. The calculus of variations 
went through these preliminary stages so quickly that it may almost be said to have 
begun with Euler in his Methodus inveniendi. Here, Euler set out a general method 
for finding functions that minimise or maximise a given integral, and are perhaps 
subject to other constraints. 

Euler thought of the calculus in the language of infinitesimals.¹ So he took three
points on a curve that is the graphical representation of y as a function of x, say (x, y),
(x′, y′), and (x″, y″) that are infinitesimal distances apart and varied the curve so that
it passed through the infinitesimally nearby point (x′, y′ + nv) instead. He thought
of these changes as if they were finite and then treated them as infinitesimal, not as
limits of finite changes.

¹A good account of Euler’s method is provided in Fraser [106]. Importantly, it does not follow the
accounts of Carathéodory [31] and Goldstine [121] in interpreting Euler’s arguments by bringing
them into line with later methods, which obscures some of the points that Lagrange was to criticise.
For English translations of some of the work of Euler and Lagrange, see Struik Source Book 399-413.

Euler now supposed that the curve is the one that maximises or minimises a certain 
integral over an interval of a function Z of x, y, and perhaps some derivatives of y 
with respect to x. Because the integral is an extreme, the effect of these infinitesimal 
changes must be zero. Euler let the infinitesimal horizontal distances be dx, the
change in y at x be dy, and defined p by the equation dy = p dx, with similar
changes at x′ = x + dx and x″ = x′ + dx.

He then considered the effect of these changes on the integral. The change is a
sum of infinitesimal amounts that reflect the new value of y at x′ and the consequent
changes in any other quantity that enters the integral, so the changes are
Z dx + Z′ dx + ···, where Z is the value of Z at (x, y, p), Z′ is the value of Z at (x′, y′, p′),
and so on. But the change in the curve is concentrated in the infinitesimal region
around the points (x′, y′) and (x′, y′ + nv). So the change in the integral is
Z dx + Z′ dx.

Euler wrote

$$dZ = M\,dx + N\,dy + P\,dp, \qquad dZ' = M'\,dx + N'\,dy' + P'\,dp',$$

and proceeded to calculate M, N, P, M′, N′, P′ in terms of the change nv to the
curve. He found that

$$dZ = P\frac{nv}{dx}, \qquad dZ' = N'\,nv - P'\frac{nv}{dx}.$$

So the change in the integral is given by

$$(dZ + dZ')\,dx = nv\,(P + N'\,dx - P'),$$

and this, because the integral is an extremum, is zero.
Euler wrote P′ − P = dP, replaced N′ by N, and obtained N dx − dP = 0 or

$$N - \frac{dP}{dx} = 0 \tag{7.1}$$

as a necessary condition for the curve to be an extremal of the integral. This is the
first occurrence of the Euler-Lagrange equations in the calculus of variations.
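
As a check on Eq. (7.1), consider the familiar problem of the shortest curve between two points in the plane. The example is my choice for illustration and is worked in the notation dZ = M dx + N dy + P dp used above; it is not one of the cases discussed in this section.

```latex
% Assumed example: arc length, Z = \sqrt{1 + p^2}, with p = dy/dx.
\begin{align*}
  Z &= \sqrt{1 + p^2}
    && \text{integrand of } \textstyle\int Z\,dx\\
  M = 0, \quad N = 0, \quad P &= \frac{p}{\sqrt{1 + p^2}}
    && \text{from } dZ = M\,dx + N\,dy + P\,dp\\
  N - \frac{dP}{dx} = 0 \;&\Longrightarrow\; \frac{p}{\sqrt{1 + p^2}} = \text{constant}
    && \text{Eq. (7.1)}\\
  p = m \;&\Longrightarrow\; y = mx + k
    && \text{a straight line}
\end{align*}
```

Because P is a strictly increasing function of p, the constancy of P forces p itself to be constant, and the extremal is the straight line through the two given points.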

The confusion this exposition induces is not simply a matter of the use of
infinitesimals. As Fraser notes ([106], 185), Euler has used the d symbol in two ways. First,
to denote a fixed infinitesimal separation of the x coordinates, and this was
standard practice in the Leibnizian calculus of the day. Second, to denote the change in
a quantity consequent upon the change in y′, and among these we find that dy′ = nv,
dp = nv/dx, and dp′ = −nv/dx while in this sense dx = dy = dp″ = 0. Euler
then extended this method to apply to variations subject to constraints, which
incidentally had the effect of highlighting the two uses of the d symbol.

The next year Euler received a response to his book that surprised and delighted
him. This was a letter written by the 19-year-old Lagrange on 12 August 1755 in
which he set out a new method for tackling these problems, which he illustrated with
the solution of three examples.

In the letter, Lagrange introduced the symbol δ for the variation in the curve, so,
for example, δx = 0. He wrote δFy to denote the change in F, a function of y, and
claimed with very little justification that

\[ d\,\delta Fy = \delta\,dFy, \tag{7.2} \]

and so in particular dδy = δdy. His method thereafter is a judicious combination of
the new rule in Eq. (7.2) and integration by parts, and among other results Lagrange
obtained a better derivation of equation (7.1), which now deserves to be called the
Euler-Lagrange equation.

As Fraser notes ([106], 163), Lagrange surely introduced his new symbol δ to
sort out Euler’s ambiguous use of d, then somehow came to the belief that d and
δ commute, and had the good idea (which Euler had missed) of using integration by
parts. This is most useful when the integral has fixed end points because the variation
is necessarily zero there.

A rich correspondence between the two men followed. Euler did not, at first,
appreciate the way Lagrange’s δ works. He also queried the crucial step that passes
from the vanishing of the integral expressing the variation of the original integral
to the vanishing of its integrand, and thus to the Euler-Lagrange equation.
Lagrange, in his reply, gave a general argument that remains unconvincing.

Between 1756 and 1760, Lagrange refined his method and focussed it on prob- 
lems in mechanics, including the brachistochrone problem, and in 1760-1761 he 
published his own account of the calculus of variations [170]. The argument he set 
out in this paper is opaque at key stages, and as the calculus of variations developed 
numerous interpretations were provided that culminated in what today is called the 
direct method. As Fraser points out, the modern approach is not a reasonable inter- 
pretation of what Lagrange did, and the reader is referred to Fraser’s paper for the 
details. 

Lagrange argued that to find the minimum or maximum of an integral, say ∫Z,
one does as one does in the calculus and differentiates and equates to zero:

\[ \delta \int Z = 0. \]

Here, Z is to be regarded as a function of variables x, y, z and their differences
dx, dy, dz, d²x, d²y, d²z, .... The equation, he said, offering no explanation,
could also be written as

\[ \int \delta Z = 0. \]

So one writes Z out in full, and “as one sees easily”

\[ \delta\,dx = d\,\delta x, \qquad \delta\,d^2x = d^2\,\delta x, \]

“and so for the others”.

He then suggested that in any given problem, there is always some relationship
between δx, δy, δz, dδx, dδy, ....

We can at least look at how Lagrange determined the brachistochrone or curve of
quickest descent between two points.² He took x, y, and z as a set of three mutually
perpendicular axes, with the x axis vertical, so the time of descent is given by

\[ \int Z, \qquad Z = \frac{ds}{\sqrt{x}}, \qquad ds = \sqrt{dx^2 + dy^2 + dz^2}. \]


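The integrand Z involves four quantities, x, dx, dy, dz (the last three through ds),
so according to Lagrange’s rules δZ is a sum of four terms:

\[ -\frac{\delta x\,ds}{2x\sqrt{x}}, \qquad \frac{dx\,\delta dx}{\sqrt{x}\,ds}, \qquad \frac{dy\,\delta dy}{\sqrt{x}\,ds}, \qquad \frac{dz\,\delta dz}{\sqrt{x}\,ds}, \]

and all other quantities in the general theory vanish. In his terminology,

\[ n = -\frac{ds}{2x\sqrt{x}}, \qquad p = \frac{dx}{\sqrt{x}\,ds}, \qquad P = \frac{dy}{\sqrt{x}\,ds}, \qquad \varpi = \frac{dz}{\sqrt{x}\,ds}. \]

To find the curve of quickest descent, one therefore has the equations

\[ n - dp = 0, \qquad -dP = 0, \qquad -d\varpi = 0, \]

and so

\[ \frac{ds}{2x\sqrt{x}} + d\,\frac{dx}{\sqrt{x}\,ds} = 0, \qquad d\,\frac{dy}{\sqrt{x}\,ds} = 0, \qquad d\,\frac{dz}{\sqrt{x}\,ds} = 0. \]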


For these three equations to represent a unique curve it is necessary, he said, that they 
reduce to two, which they do because, as he showed, the second and third imply the 
first. 

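He then integrated these two equations and obtained

\[ \frac{dy}{\sqrt{x}\,ds} = \frac{1}{\sqrt{a}}, \qquad \frac{dz}{\sqrt{x}\,ds} = \frac{1}{\sqrt{b}}, \]

whence

\[ \frac{dy}{dz} = \frac{\sqrt{b}}{\sqrt{a}}. \]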


² See Lagrange ([170], 339-341).

This equation shows that the solution curve lies in a vertical plane. He now assumed
that the x-axis passes through the curve, which takes care of the constant of integration
when integrating both sides of the equation, and gave the equation of the plane as

\[ z = y\,\frac{\sqrt{a}}{\sqrt{b}}. \]

Lagrange then took coordinates x and t in this plane, where √(y² + z²) = t. This gave
him z as a function of t and y as a function of t, and finally, on setting ab/(a + b) = c, a
differential equation connecting x and t:

\[ dt = \frac{\sqrt{x}\,dx}{\sqrt{c - x}}, \]

which is “the equation for a cycloid described on a horizontal base by a circle of
diameter equal to c”.³
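Lagrange's closing claim can be checked numerically. The following is a minimal sketch, not Lagrange's own argument: it assumes the standard parametrisation of a cycloid generated by a circle of diameter c, with x measured vertically downwards and t horizontally, and compares the slope dt/dx along that curve with the right-hand side of the differential equation. The function names and the sample value c = 2 are illustrative choices.

```python
import math

def cycloid(theta, c):
    """Point (x, t) on a cycloid generated by a circle of diameter c.

    x is the vertical drop and t the horizontal distance, matching the
    roles of x and t in Lagrange's equation dt = sqrt(x) dx / sqrt(c - x).
    """
    x = 0.5 * c * (1.0 - math.cos(theta))
    t = 0.5 * c * (theta - math.sin(theta))
    return x, t

def slope_from_ode(x, c):
    """Right-hand side dt/dx = sqrt(x / (c - x)) of the differential equation."""
    return math.sqrt(x / (c - x))

def slope_from_cycloid(theta, c, h=1e-6):
    """dt/dx along the cycloid, estimated by a central difference in theta."""
    x1, t1 = cycloid(theta - h, c)
    x2, t2 = cycloid(theta + h, c)
    return (t2 - t1) / (x2 - x1)

c = 2.0  # assumed sample diameter
samples = [0.3, 0.8, 1.5, 2.2, 2.9]  # parameter values strictly between 0 and pi
max_gap = max(
    abs(slope_from_cycloid(th, c) - slope_from_ode(cycloid(th, c)[0], c))
    for th in samples
)
print("largest discrepancy between the two slopes:", max_gap)
```

The discrepancy is at the level of the finite-difference error, which is what one expects if the cycloid satisfies the equation on the descending half of the arch.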


7.3 Maupertuis and the Principle of Least Action 


Abstract though it is, the principle of least action was the occasion for the most 
unpleasant scientific controversy of the century. Pierre Louis Maupertuis, the head 
of the Berlin Academy, presented a paper to the Berlin Academy in 1744 in which 
he claimed to show that light travels in a way that continually minimises a quantity 
called its action (a concept to be defined below). He published an extended version 
of the same argument in 1746, which he now applied to the motion of a mechanical 
system. He also gave the principle a profoundly teleological spin by suggesting that 
the system evolved to meet a pre-assigned goal, and made it the animating principle 
of all of nature and the basis of a proof of the existence of God. As he observed, 
Euler had made a strictly mathematical statement of the principle in his Methodus 
inveniendi (E 65, [74]), but Maupertuis’s claim was far grander. 

In essence, Maupertuis’s claim was that a benign deity had seen to it to produce 
a world in which everything happened with a minimum of effort, or rather, that a 
world in which things happened with minimal effort was evidence for a benign deity. 
Quite why action corresponded to hard work was not clear, and the whole claim was 
ridiculed by Voltaire in 1759 in one of the great books of the Enlightenment, Candide. 
Voltaire could not accept that we lived in the best of all possible worlds when all 
too painfully it was the world of the Seven Years War and the Lisbon earthquake of 
1755. 

Maupertuis was a vain man who courted fame. He was known as ‘The Great
Flattener’ because he had led the successful French expedition to Lapland in the late
1730s that verified the flattening of the Earth at its poles and helped to persuade
Continental Europe of the merits of the Newtonian calculus and Newton’s theory of
gravitation.⁴ He was friends with Voltaire, and, importantly for this story, the Swiss
Academician Samuel König, whom Voltaire had persuaded to teach his lover Émilie
du Châtelet algebra.⁵

³ It is easier to follow this argument on choosing the axes so that everything happens in the plane
z = 0. See also the derivation in Sect. 7.5.

However, in 1751, Samuel König published a paper criticising the principle of
least action. Maupertuis took it personally and accused the author of plagiarism. He
charged König with forging a letter by Leibniz stating the principle of least action,
which would have denied Maupertuis priority. When this failed, he obtained Euler’s
support and tried to have König driven out of the Academy. It seems that Euler, who
was generally a benign man, usually got his way with Maupertuis by humouring
him, but felt that on this occasion he had to give something back. Maupertuis’s
actions enraged Voltaire, who promptly published a pamphlet entitled Diatribe du
docteur Akakia, médecin du pape. King Frederick II, who took a lordly interest
in his Academy in Berlin, publicly supported Maupertuis, and was so enraged by
Voltaire’s pamphlet that he had it burned by the hangman in public places in Berlin.
Matters eventually calmed down, and in 1752 König was given an official censure
by the Academy but allowed to remain as an Academician.

What, more precisely, was the principle of least action? Maupertuis expressed it 
this way: 

Whenever there is a change in nature, the quantity of action necessary for this change is as
small as possible. The quantity of action is the product of the mass of the body by its speed
and by the distance through which it has moved.


Euler, in the second appendix to his Methodus inveniendi, wrote that a body of
mass M, moving with the speed it would have acquired by falling through a height v
and travelling a distance ds, has a quantity of motion M ds √v, and

I say that the path the body will describe, by comparison with all the others with the same
start and end points, will minimise ∫ M ds √v, or, if M is a constant, ∫ ds √v.

If we say that in falling a height h from rest a body of mass M loses an amount of
potential energy equal to Mgh, which is equal to the amount of kinetic energy it gains,
½Mu², then h = u²/(2g) and Euler is claiming that the path of the particle minimises the
integral

\[ \frac{M}{\sqrt{2g}} \int u\,ds, \]

which, up to the constant factor, is what Maupertuis said: the integrand is the product
of the momentum of the body and the distance through which it has moved in an instant.
If we also note that u = ds/dt, then we can say that the integral is

\[ \frac{M}{\sqrt{2g}} \int u^2\,dt. \]
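The identification of Euler's integral with the integral of u² can be illustrated on the simplest possible motion. The sketch below assumes a body of unit mass falling vertically from rest, so that the height fallen is gt²/2 and the speed is gt; it evaluates Euler's ∫ √h ds and the rewritten form (1/√(2g)) ∫ u² dt by the midpoint rule and compares both with the closed form g^{3/2} T³ / (3√2). The value of g, the duration T, and the function names are assumptions made for the illustration.

```python
import math

G = 9.81  # assumed value of the acceleration due to gravity

def euler_integral(total_time, n=20000):
    """Midpoint rule for the integral of sqrt(h) ds along a vertical fall from rest."""
    dt = total_time / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        height_fallen = 0.5 * G * t * t   # h, the height fallen so far
        speed = G * t                     # u, so that ds = u dt
        total += math.sqrt(height_fallen) * speed * dt
    return total

def speed_squared_integral(total_time, n=20000):
    """Midpoint rule for (1 / sqrt(2g)) times the integral of u^2 dt."""
    dt = total_time / n
    total = sum((G * (i + 0.5) * dt) ** 2 for i in range(n)) * dt
    return total / math.sqrt(2.0 * G)

T = 1.5  # assumed duration of the fall
closed_form = G ** 1.5 * T ** 3 / (3.0 * math.sqrt(2.0))
print(euler_integral(T), speed_squared_integral(T), closed_form)
```

The three numbers agree to within the quadrature error, as the substitution h = u²/(2g) and ds = u dt predicts.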


⁴ See Terrall [253] for a detailed account.

⁵ Du Châtelet is remembered as the mathematician who translated Newton’s Principia into French.
For an account of the difficulties this involved, and the merits of her extensive commentaries, see
Zinsser [278].

This agrees with the definition of the action in use today, which, in problems
involving particle motion, is the integral over time of the difference of the kinetic and
potential energies in the motion. If the kinetic energy is represented by T and the
potential energy by V, then the action is ∫(T − V) dt, and in problems about motion
under gravity that starts from rest, with V measured from the starting height, we have
that T = −V, so T − V = 2T = Mu².

The crucial point is that whatever deductions follow from the new principle have
to agree with those that follow from Newton’s laws. So the correct formula for the action
is not a new quantity; it must be a disguised version of one already known, and the
minimising principle can only be a disguised version of Newton’s laws. Or, of course,
it could be the other way round: the action principle is fundamental, Newton’s laws
follow from it, and we just happened to have discovered them first, which, given the
origin of mechanics in astronomy, was surely inevitable. Either way, the philosophical
implications Maupertuis drew were ultimately spun out of a misunderstanding.

Before we dismiss them, however, it is worth observing that they are curious. It 
is not difficult to imagine a particle feeling a force at every instant and responding 
to it. It is much harder to imagine a particle considering every possible path between
two points before deciding which one to take. In the first case, one can imagine the 
particle needs no help deciding what to do, but in the second case, one is tempted 
to imagine it appealing to an all-knowing higher authority, which is the essence of 
Maupertuis’s theological argument. 

That said, we are left with the problem of showing the equivalence of the principle 
of least action and Newton’s laws; however, a valid modern derivation requires more 
theory than we have at present.⁶

There is also a strong pragmatic reason for preferring the principle of least action. 
It can be much easier to use in problems in dynamics because it fits very well with 
the framework of generalised coordinates, to which we now turn. 


7.4 Euler’s Later Approach 


In his paper (E420, [97]), Euler returned to the calculus of variations and derived its
fundamental equations in a way that pointed the way to all future treatments. Given the
problem of finding an extremal of an integral involving a function y(x), Euler supposed
there was a one-parameter family of functions y(x, t), such as y(x, t) = y(x) +
tV(x), so that y(x) can be approximated arbitrarily closely. He then considered the
variation in the integral with respect to t, evaluated at t = 0, and argued that at an
extremal this variation should vanish.


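⁶ It is nicely explained in the Notes for the Harvard course Mechanics 151, see
(http://www.people.fas.harvard.edu/~djmorin/chap6.pdf).

The differential of a function Z(x, y) is, as he wrote, dZ = M dx + N dy, where
M = ∂Z/∂x and N = ∂Z/∂y, and if a change in y is all that is considered then dx = 0
and dy = (∂y/∂t) dt, so one can write ∂Z/∂t = N ∂y/∂t.

He wrote higher derivatives as

\[ p = \frac{\partial y}{\partial x}, \quad q = \frac{\partial p}{\partial x} = \frac{\partial^2 y}{\partial x^2}, \quad r = \frac{\partial q}{\partial x} = \frac{\partial^3 y}{\partial x^3}, \ \ldots, \]

and

\[ \frac{\partial p}{\partial t} = \frac{\partial^2 y}{\partial x\,\partial t}, \quad \frac{\partial q}{\partial t} = \frac{\partial^3 y}{\partial x^2\,\partial t}, \quad \frac{\partial r}{\partial t} = \frac{\partial^4 y}{\partial x^3\,\partial t}. \]

So now, if

\[ dZ = M\,dx + N\,dy + P\,dp + Q\,dq + R\,dr + \cdots \]

then, holding x fixed, so dx = 0, the infinitesimal variation in y produces

\[ dy = \frac{\partial y}{\partial t}\,dt, \quad dp = \frac{\partial p}{\partial t}\,dt = \frac{\partial^2 y}{\partial x\,\partial t}\,dt, \quad dq = \frac{\partial^3 y}{\partial x^2\,\partial t}\,dt, \quad dr = \frac{\partial^4 y}{\partial x^3\,\partial t}\,dt. \]

So, the variation in Z is given by

\[ \delta Z = \frac{\partial Z}{\partial t}\,dt = N\,\frac{\partial y}{\partial t}\,dt + P\,\frac{\partial^2 y}{\partial x\,\partial t}\,dt + Q\,\frac{\partial^3 y}{\partial x^2\,\partial t}\,dt + R\,\frac{\partial^4 y}{\partial x^3\,\partial t}\,dt + \cdots. \]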


The variation of an integral goes like this:

\[ \delta \int Z\,dx = \int \delta Z\,dx = \int \frac{\partial Z}{\partial t}\,dt\,dx = dt \int \frac{\partial Z}{\partial t}\,dx. \]

Applying this to the expansion of ∂Z/∂t, Euler deduced that the variation of the
integral ∫Z dx is given by the series

\[ dt \int \left( N\,\frac{\partial y}{\partial t} + P\,\frac{\partial^2 y}{\partial x\,\partial t} + Q\,\frac{\partial^3 y}{\partial x^2\,\partial t} + \cdots \right) dx. \]

He then integrated by parts, restricted attention to variations that vanish at the
boundary, and deduced that for an extremal

\[ dt \int \frac{\partial y}{\partial t} \left( N - \frac{dP}{dx} + \frac{d^2 Q}{dx^2} - \cdots \right) dx = 0, \]

and therefore that the Euler-Lagrange equation holds:

\[ N - \frac{dP}{dx} + \frac{d^2 Q}{dx^2} - \cdots = 0. \]

He discussed the geometric meaning of these assumptions in a section at the end
of the paper, and remarked that δy = 0 means that y(x, t) and y(x) agree at the
boundary points, δp = 0 means that they have parallel tangents at the boundary,
and so on.

Euler also showed how to extend the subject to find the equations governing the 
extremal of integrals of functions of two variables. 
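Euler's one-parameter family lends itself to a direct numerical illustration. The sketch below is an assumption-laden example rather than anything in Euler's paper: it takes the arc-length integral of √(1 + y'²) over [0, 1], the comparison family y(x) + tV(x) with V(x) = sin(πx) (which vanishes at both end points), and estimates the derivative of the integral with respect to t at t = 0 by a central difference. For the straight line y = x, which is an extremal, the derivative should vanish; for the parabola y = x² it should not.

```python
import math

def arc_length(dy, n=2000):
    """Midpoint rule for the integral of sqrt(1 + y'(x)^2) over [0, 1]."""
    h = 1.0 / n
    return sum(math.sqrt(1.0 + dy((i + 0.5) * h) ** 2) for i in range(n)) * h

def first_variation(dy, dV, eps=1e-5):
    """Central-difference estimate of d/dt of the integral for y + t V at t = 0."""
    def integral_at(t):
        return arc_length(lambda x: dy(x) + t * dV(x))
    return (integral_at(eps) - integral_at(-eps)) / (2.0 * eps)

def dV(x):
    """Derivative of the assumed variation V(x) = sin(pi x)."""
    return math.pi * math.cos(math.pi * x)

variation_line = first_variation(lambda x: 1.0, dV)          # y = x
variation_parabola = first_variation(lambda x: 2.0 * x, dV)  # y = x^2
print("straight line:", variation_line)
print("parabola:     ", variation_parabola)
```

Only the straight line makes the first variation vanish, in agreement with the fact that the Euler-Lagrange equation for arc length forces y'' = 0.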


7.5 Brachistochrone and the Calculus of Variations 


The Euler-Lagrange equation for an integrand F(x, y, y') is

\[ \frac{d}{dx} F_{y'} - F_y = 0. \]

Remember that this involves the total derivative with respect to x, not the partial
derivative, so we have for any differentiable function G

\[ \frac{d}{dx} G(x, y, y') = G_x + G_y\,y' + G_{y'}\,y''. \]


Now for the brachistochrone, or curve of quickest descent. The problem is to find
the curve along which a frictionless mass point sliding along the curve will descend
under gravity (acting in the y direction) from A = (x₀, 0) to B = (x₁, y₁) in the
shortest time, given that its initial velocity is zero. When the point has descended a
distance y its speed will be √(2gy). At that moment, in an instant of time
dt it moves a distance ds, where

\[ ds^2 = dx^2 + dy^2 = \left( 1 + \left( \frac{dy}{dx} \right)^2 \right) dx^2 = (1 + y'^2)\,dx^2, \]

so at each instant of time

\[ dt = \sqrt{\frac{1 + y'^2}{2gy}}\,dx. \]

So the time of descent is

\[ T = \int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2}{2gy}}\,dx. \]

We write F(x, y, y') = √((1 + y'²)/(2gy)).
To compute the Euler-Lagrange equation, set h = (1 + y'²)^{1/2} and (2g)^{-1/2} = a,
so

\[ F(x, y, y') = a\,h\,y^{-1/2}. \]

Now calculate h_y, h_{y'}, F_y, and F_{y'}, and verify that the Euler-Lagrange
equation simplifies to

\[ y'^2 y'' y + \tfrac{1}{2} h^2 y'^2 = h^2 y'' y + \tfrac{1}{2} h^4, \]

and therefore to

\[ (y'^2 - h^2)\,y'' y = \tfrac{1}{2} h^2 (h^2 - y'^2). \]

All is not lost, because h² − y'² = 1, so the Euler-Lagrange equation becomes

\[ -y'' y = \tfrac{1}{2} h^2 = \tfrac{1}{2} (1 + y'^2). \]

Verify that the equation

\[ -y'' y = \tfrac{1}{2} (1 + y'^2) \]

is obtained by differentiating

\[ y\,(1 + y'^2) = k, \]

where k is a constant, and deduce that

\[ y'^2 = \frac{k - y}{y}. \]

Set y = k sin²(z) and deduce that the Euler-Lagrange equation has become

\[ 2k \sin^2 z\,dz = dx, \]

and so

\[ x = \frac{k}{2}\,(2z - \sin 2z). \]

So x and y have been expressed in terms of a parameter z, and the brachistochrone is
found to be a cycloid traced by a point on the circumference of a wheel of diameter
k that rolls along the x-axis.
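The two claims at the end of this derivation can be tested numerically. The sketch below assumes the parametrisation x = (k/2)(2z − sin 2z), y = k sin² z, checks that y(1 + y'²) stays equal to k along the curve, and compares the time of descent along the cycloid with the time along the straight chord to the same end point. The value of g, the choice k = 1, the end point z = π/2, and the function names are assumptions made for the illustration.

```python
import math

G = 9.81  # assumed value of the acceleration due to gravity

def cycloid_point(z, k):
    """Point (x, y) on the curve x = (k/2)(2z - sin 2z), y = k sin^2 z."""
    return 0.5 * k * (2.0 * z - math.sin(2.0 * z)), k * math.sin(z) ** 2

def cycloid_velocity(z, k):
    """Derivatives (dx/dz, dy/dz) of the parametrisation."""
    return 2.0 * k * math.sin(z) ** 2, 2.0 * k * math.sin(z) * math.cos(z)

def first_integral(z, k):
    """The quantity y (1 + y'^2), which should equal k along the curve."""
    _, y = cycloid_point(z, k)
    dx, dy = cycloid_velocity(z, k)
    return y * (1.0 + (dy / dx) ** 2)

def cycloid_time(z_end, k, n=10000):
    """Time of descent: midpoint rule for the integral of ds / sqrt(2 g y)."""
    h = z_end / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        _, y = cycloid_point(z, k)
        dx, dy = cycloid_velocity(z, k)
        total += math.hypot(dx, dy) / math.sqrt(2.0 * G * y) * h
    return total

def line_time(x1, y1):
    """Time to slide from rest down the straight chord from (0, 0) to (x1, y1)."""
    length = math.hypot(x1, y1)
    return length * math.sqrt(2.0 / (G * y1))

k, z_end = 1.0, math.pi / 2.0
x1, y1 = cycloid_point(z_end, k)
print("first integral:", first_integral(0.7, k))
print("cycloid time:  ", cycloid_time(z_end, k))
print("chord time:    ", line_time(x1, y1))
```

Along the cycloid the integrand is constant, so the time is √(2k/g) times the final parameter value, and it comes out shorter than the time along the chord.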


7.6 Generalised Coordinates


This material is here for later use, in Chaps. 24 and 25. 

Lagrange introduced the method of generalised coordinates in mechanics in a 
memoir of 1788, when he returned to a problem he had looked at in 1764 on the 
libration of the Moon.⁷ The virtue of his method was that it enabled him to choose new
variables with which to analyse libration, and this led to new differential equations 
that led to solutions that were easier to interpret. 

Equally importantly, in those intervening years, as Fraser [105] discusses, 
Lagrange moved away from relying on the principle of least action as a fundamental 
principle in physics. This principle, as Maupertuis and even Euler had described it, 
had a metaphysical aspect that Lagrange increasingly disliked. He much preferred 
to put his trust in formal, algebraic arguments. By 1788, when Lagrange published 
his Méchanique Analitique, he disparaged the use of such phrases as “least action” 


as if these vague and arbitrary denominations comprise the essence of the laws of mechanics 
and can by some secret virtue establish in final causes the simple results of the known laws 
of mechanics. 


He might have picked up this attitude from his mentor, d’Alembert. Or, as Fraser 
speculates, he might also have come to realise that his approach, which grew out of 
d’Alembert’s, was more general, which it is because it does not require the forces in
a problem to be given by a potential function. The calculus of variations remained 
fundamental to Lagrange’s approach, not because it led to the formulation of a phys- 
ical problem as the stationary value of an integral but because it led to a formulation 
in terms of differential equations. 

The manner in which Lagrange presented his new theory of mechanics is not
easy to read or to describe, and in the absence of a thorough modern account that
makes it easy to read him carefully and accurately I have chosen to follow Lützen
([192], 640-642) and Pulte [231] and to indicate the outlines of his achievements
while suppressing the details of his methods.⁸

We shall suppose that what is at issue is the motion of n particles, and that the
jth particle has mass m_j and coordinates (x_j, y_j, z_j) with respect to some Cartesian
frame of reference. Lagrange introduced new variables q_1, q_2, ..., q_N, N = 3n, and
supposed that every x_j, y_j, and z_j is a function of the new variables q_1, q_2, ..., q_N.

It is clear that in principle any system of equations that expresses the motion of the
n particles can be written as a system of equations in the new variables. Specifically,
consider the kinetic energy of the system,

\[ T = \frac{1}{2} \sum_{j=1}^{n} m_j \left( \dot{x}_j^2 + \dot{y}_j^2 + \dot{z}_j^2 \right). \]

⁷ Libration is the slow oscillation in the motion of the Moon that enables us to see a little more than
half of its surface (about 59%). It is largely a result of the elliptical orbit of the Moon. D’Alembert
had published theoretical papers on it in 1761, and Cassini and Mayer had published observational
accounts.

⁸ For the original treatment, see Lagrange Mécanique analytique [178], reprinted in Lagrange
Oeuvres 11, Part 2, Sect. IV, especially pp. 334 and 336.

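Assume also that the forces are conservative, which means that they are given by
the gradient of a potential function U(x_1, ..., z_n), with the convention that the force
is minus the gradient. Then T will be a function of the variables q_1, ..., q_N and their
time derivatives q̇_1, ..., q̇_N, and U will be a function of the variables q_1, ..., q_N.
What Lagrange showed was that the equations of motion can be written as

\[ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_j} - \frac{\partial T}{\partial q_j} + \frac{\partial U}{\partial q_j} = 0. \]

Later writers introduced the ‘Lagrangian’ L = T − U in terms of which the equations
of motion take the form

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0. \]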
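A single generalised coordinate is enough to see these equations at work. The sketch below is an illustration under stated assumptions, not Lagrange's own example: a plane pendulum of mass M and length ELL, with the angle q from the downward vertical as the coordinate, kinetic energy T = ½ M ELL² q̇² and potential U = −M G ELL cos q. The helper `lagrange_residual` is hypothetical; it evaluates d/dt(∂T/∂q̇) − ∂T/∂q + ∂U/∂q by central differences and the chain rule, and the residual should vanish exactly when the acceleration is the Newtonian value −(G/ELL) sin q.

```python
import math

M, ELL, G = 1.0, 2.0, 9.81  # assumed mass, pendulum length, gravity

def kinetic(q, qdot):
    """Kinetic energy of the pendulum in terms of the angle q and its rate."""
    return 0.5 * M * ELL ** 2 * qdot ** 2

def potential(q):
    """Potential energy, measured from the pivot."""
    return -M * G * ELL * math.cos(q)

def lagrange_residual(q, qdot, qddot, h=1e-4):
    """d/dt(dT/dqdot) - dT/dq + dU/dq, with partial derivatives by central differences."""
    def dT_dqdot(q_, qd_):
        return (kinetic(q_, qd_ + h) - kinetic(q_, qd_ - h)) / (2.0 * h)

    dT_dq = (kinetic(q + h, qdot) - kinetic(q - h, qdot)) / (2.0 * h)
    dU_dq = (potential(q + h) - potential(q - h)) / (2.0 * h)
    # Chain rule for the total time derivative of dT/dqdot along the motion.
    mixed = (dT_dqdot(q + h, qdot) - dT_dqdot(q - h, qdot)) / (2.0 * h)
    second = (dT_dqdot(q, qdot + h) - dT_dqdot(q, qdot - h)) / (2.0 * h)
    return mixed * qdot + second * qddot - dT_dq + dU_dq

q, qdot = 0.4, 0.3
newtonian = -(G / ELL) * math.sin(q)
print("residual with Newtonian acceleration:", lagrange_residual(q, qdot, newtonian))
print("residual with zero acceleration:     ", lagrange_residual(q, qdot, 0.0))
```

The point of the design is that only the two scalar functions T and U are supplied; no forces or constraint reactions along the rod appear anywhere.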


Lagrange was at least clear about the advantages of his method. Even the simplest 
mechanical problem expressed in rectangular Cartesian coordinates is tiresome to 
convert to a coordinate system better adapted to the problem, as, for example, spher- 
ical coordinates often are. The new coordinates are entirely general, so the equations 
are coordinate-free, and symmetries of the problem turn up as simpler systems of 
equations. 


Because these equations can have various forms that are less simple or more simple, and 
above all easier to integrate, it is not a matter of indifference in which form they are presented 
at the start; and it is perhaps one of the principal advantages of our method that it always 
provides the equations for each problem in the most simple form relative to the variables 
that it employs, and puts one in a position to judge in advance which are the variables to use 
that will most simplify the integration. Here, for this purpose, are some general principles 
that one will see applied in what follows to the solution of different problems. (Lagrange, 
Oeuvres Vol. 11, 336-337.) 


7.7 Exercises 


1. Prove that the shortest curve joining two points in the plane is the straight line
between them.
2. Show that the curve in the (x, y)-plane joining the points (−a, b) and (a, b) that
produces the surface of least area when rotated around the x-axis is the catenary
y = c cosh(x/c) for a suitable value of c. This is equivalent to minimising the integral

\[ \int_{-a}^{a} y\,(1 + y'^2)^{1/2}\,dx. \]


Questions


1. If Newton’s laws and the principle of least action give the same answers to the 
same problems in dynamics, what is the difference between them? 

2. Find and enjoy articles on Jakob Steiner and the isoperimetric problem. 

3. The physicist Richard Feynman greatly appreciated the principle of least action 
and made it the basis of his theory of quantum mechanics. Find out what you can 
about his views and arguments—he provided numerous accessible accounts. 


Chapter 8
Monge and Solutions to Partial Differential Equations


8.1 Introduction 


Lagrange’s account of the solution of first-order partial differential equations is, at its 
core, formal and algebraic and not easy to understand. The account that has become 
the basis of all further elementary accounts was published in 1809 by Monge, in 
his Application de l’analyse à la géométrie (see the Addition, pp. 367-414), and
it brings out his remarkable geometrical gifts. Here, we look briefly at its origins in 
Monge’s earlier work, and how he tried to extend these ideas to second-order partial 
differential equations. 


8.2 Monge and First-Order Partial Differential Equation 


Monge (see Fig. 8.1) gave two accounts of how to solve partial differential equations 
in two memoirs in the Histoire de l’Académie Royale des Sciences for 1784 (published 
in 1787). The first eliminates arbitrary constants and arbitrary functions from an 
equation by repeated differentiation to show that the result is a partial differential 
equation. The second paper then reverses this idea and shows how arbitrary functions 
arise in the solutions of partial differential equations. 

In §3 of his (1787b), Monge took up the quasi-linear first-order partial differential 
equation: 

Mp+Nq+L=0, 


where M, N, L are functions of x, y, and the unknown function z, and p = z_x and
q = z_y.

Fig. 8.1 Gaspard Monge (1746-1818), artist unknown

He argued that the partial differential equation cannot be solved for p and q, yet,
paradoxically, it seems that it can be. For, using the equation dz = p dx + q dy, one
immediately obtains these two equations for p and q:

\[ M\,dz + L\,dx = q\,(M\,dy - N\,dx), \]
\[ N\,dz + L\,dy = p\,(N\,dx - M\,dy). \]


Monge’s way out of this apparent impossibility was to argue that these equations are 
to be understood as a set of restrictions on any function z that satisfies the partial 
differential equation. They say that the system of differential equations 


\[ M\,dy - N\,dx = 0, \qquad M\,dz + L\,dx = 0, \qquad N\,dz + L\,dy = 0 \]

can be solved simultaneously (any two imply the third) and their solution, for any
arbitrary point (x₀, y₀, z₀), defines a curve through that point that lies in the solution
surface. In other words, all solution surfaces through that point have this curve in
common.

To obtain the solution of the partial differential equation Monge then said that if
two of these equations, or two of their consequences, can be integrated explicitly,
say to provide equations of the form

\[ f(x, y, z) = a \quad \text{and} \quad g(x, y, z) = b, \]

where a and b are constants, then the complete integral of the partial differential
equation will be of the form

\[ f(x, y, z) = \varphi(g(x, y, z)), \]

where φ is an arbitrary function.
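A small example shows how Monge's recipe produces an arbitrary function. Take the equation xp + yq − z = 0, so that M = x, N = y, L = −z; this equation and everything in the sketch below are illustrative assumptions, not taken from Monge's memoir. The equations M dy − N dx = 0 and M dz + L dx = 0 integrate to y/x = a and z/x = b, so the recipe gives z/x = φ(y/x), that is, z = x φ(y/x). The code picks one arbitrary φ and checks by central differences that the resulting surface satisfies the equation.

```python
import math

def phi(u):
    """An arbitrary (hypothetical) choice of the function appearing in the integral."""
    return math.sin(u) + u * u

def surface(x, y):
    """Candidate solution z = x * phi(y / x) of x p + y q - z = 0, for x != 0."""
    return x * phi(y / x)

def pde_residual(x, y, h=1e-5):
    """x p + y q - z, with p and q estimated by central differences."""
    p = (surface(x + h, y) - surface(x - h, y)) / (2.0 * h)
    q = (surface(x, y + h) - surface(x, y - h)) / (2.0 * h)
    return x * p + y * q - surface(x, y)

for point in [(1.3, 0.7), (2.0, -1.1), (0.5, 0.25)]:
    print(point, pde_residual(*point))
```

Replacing `phi` by any other differentiable function leaves the residuals at the level of the finite-difference error, which is the sense in which the function is arbitrary.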


8.2.1 A Comparison with the Modern Account


We write the first-order quasi-linear partial differential equation in the form

\[ a p + b q = c, \]

where

\[ a = a(x, y, z), \quad b = b(x, y, z), \quad c = c(x, y, z), \]

and, as usual, p = z_x, q = z_y.

The solution to this partial differential equation will be a surface with an equation
of the form z = z(x, y), and any tangent to such a surface at the point (x, y, z) is of
the form (dx, dy, dz), which is normal to (p, q, −1). Since ap + bq − c = 0, the vector

\[ \bigl( a(x(t), y(t), z(t)),\ b(x(t), y(t), z(t)),\ c(x(t), y(t), z(t)) \bigr), \]

where x(t), y(t), z(t) are functions of a variable t, is everywhere tangent to this
surface, and so the curves that satisfy

\[ dx : dy : dz = a : b : c \]

or

\[ \frac{dx}{dt} = a, \quad \frac{dy}{dt} = b, \quad \frac{dz}{dt} = c, \]

lie in the surface, and indeed fill it out. They are called the characteristic curves and
they project down onto curves in the (x, y) plane that were called the characteristic
curves before.

It is a simple, instructive, and reassuring exercise to connect these equations to
the equations that Monge derived in 1784 (write everything in terms of t).

This suggests that we think of direction vectors at each point of space: at the
point (x, y, z), there is the vector (a(x, y, z), b(x, y, z), c(x, y, z)). So if we are
given a curve Γ in space in the form (x(s), y(s), z(s)), say for 0 < s < 1, then the
characteristic curves through the curve Γ, which we assume are not tangent to Γ,
will form the solution surface, and this is indeed the case. I omit the proof, but note
that there is something to prove.

It is helpful to give an example. We take the partial differential equation

\[ z p + q = 1, \]

with initial conditions

\[ x = s, \quad y = \cos s, \quad z = \sin s, \quad 0 < s < 1. \]

The transversality condition to be met is that along this curve

\[ a\,\frac{dy}{ds} - b\,\frac{dx}{ds} \neq 0, \]

and we have, with a = z = sin s and b = 1,

\[ -\sin^2 s - 1 \neq 0, \]

which is correct for every value of s.
The solution of the partial differential equation is then given by solving the system
of ordinary differential equations

\[ \frac{dx}{dt} = z, \quad \frac{dy}{dt} = 1, \quad \frac{dz}{dt} = 1, \]

for which

\[ x = \tfrac{1}{2} t^2 + \alpha t + \beta, \quad y = t + \gamma, \quad z = t + \alpha, \]

and finding the solution that meets the initial conditions, which are that when t = 0

\[ \beta = s, \quad \gamma = \cos s, \quad \alpha = \sin s. \]

So, the solution of the partial differential equation that meets the initial conditions
is

\[ x = \tfrac{1}{2} t^2 + t \sin s + s, \quad y = t + \cos s, \quad z = t + \sin s. \]
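Because the solution is given parametrically, it is worth confirming that it really satisfies zp + q = 1. The sketch below is one way to do so, with function names chosen for the illustration: it differentiates x, y, z with respect to s and t by hand, recovers p = z_x and q = z_y from the chain rule by Cramer's rule, and evaluates the residual zp + q − 1 at a few parameter values.

```python
import math

def surface(s, t):
    """The parametric solution (x, y, z) built from the characteristic curves."""
    x = 0.5 * t * t + t * math.sin(s) + s
    y = t + math.cos(s)
    z = t + math.sin(s)
    return x, y, z

def partials(s, t):
    """Partial derivatives of (x, y, z) with respect to s and with respect to t."""
    d_s = (t * math.cos(s) + 1.0, -math.sin(s), math.cos(s))
    d_t = (t + math.sin(s), 1.0, 1.0)
    return d_s, d_t

def pde_residual(s, t):
    """z p + q - 1, where p and q solve z_s = p x_s + q y_s and z_t = p x_t + q y_t."""
    (xs, ys, zs), (xt, yt, zt) = partials(s, t)
    jacobian = xs * yt - xt * ys          # must be non-zero to recover p and q
    p = (zs * yt - zt * ys) / jacobian
    q = (xs * zt - xt * zs) / jacobian
    z = surface(s, t)[2]
    return z * p + q - 1.0

for s, t in [(0.4, 0.3), (0.9, -0.2), (0.1, 1.0)]:
    print((s, t), pde_residual(s, t))
```

The residuals vanish to rounding error, and setting t = 0 in `surface` returns the initial curve (s, cos s, sin s).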


8.2.2. The General First-Order Case 


Before we return to Monge, it is instructive to see how far one can go pursuing his 
method but on the general first-order partial differential equation. 
We write the partial differential equation in the form

\[ F(x, y, z, p, q) = 0. \]

The aim is to get equations for dx, dy, and dz. We shall also need equations for dp
and dq, which we did not need before because p and q were presented linearly then.
We can differentiate this equation with respect to p, holding x, y, and z fixed and
regarding q as a function of p, and obtain

\[ F_p + F_q\,\frac{dq}{dp} = 0. \tag{8.1} \]

As with the quasi-linear case, it is helpful to think of a surface with an equation
z = z(x, y) that satisfies this partial differential equation. Let (x₀, y₀, z₀) be a point
on this surface, then the tangent plane to the surface at that point satisfies the equation

\[ z - z_0 = p\,(x - x_0) + q\,(y - y_0). \]

If we differentiate the equation for the tangent plane with respect to p, we obtain

\[ 0 = (x - x_0) + (y - y_0)\,\frac{dq}{dp}, \]

and by thinking of (x, y) as infinitesimally close to (x₀, y₀) we write this as

\[ dx + dy\,\frac{dq}{dp} = 0. \tag{8.2} \]


We eliminate dq/dp from Eqs. (8.1) and (8.2) and deduce that

\[ \frac{dx}{F_p} = \frac{dy}{F_q}. \]

A nice piece of elementary algebra, using dz = p dx + q dy, tells us that therefore

\[ \frac{dx}{F_p} = \frac{dy}{F_q} = \frac{dz}{p F_p + q F_q}, \tag{8.3} \]

or, equivalently, that

\[ \frac{dx}{dt} = F_p, \quad \frac{dy}{dt} = F_q, \quad \frac{dz}{dt} = p F_p + q F_q. \]

These are equations for curves that lie in the solution surface, but they are not enough
to determine the solution surface given some initial data, because the equations still
contain p and q. We can of course assume locally, without much loss of generality,
that we can solve the partial differential equation for q and obtain

\[ q = q(x_0, y_0, z_0, p). \]

This is an equation for all the planes at the point (x₀, y₀, z₀) that envelope a cone
that touches the solution surface at that point. It became known as the Monge cone.
However, purely formally, we have


\[ \frac{dp}{dt} = p_x\,\frac{dx}{dt} + p_y\,\frac{dy}{dt} = p_x F_p + p_y F_q, \]
\[ \frac{dq}{dt} = q_x\,\frac{dx}{dt} + q_y\,\frac{dy}{dt} = q_x F_p + q_y F_q, \]

and from the partial differential equation we obtain by differentiating with respect
to x and y

\[ F_x + F_z\,p + F_p\,p_x + F_q\,q_x = 0, \]
\[ F_y + F_z\,q + F_p\,p_y + F_q\,q_y = 0. \]

So, because p_y = q_x, the previous two equations can be written as

\[ \frac{dp}{dt} = -F_x - F_z\,p, \]
\[ \frac{dq}{dt} = -F_y - F_z\,q. \]

These equations, and the three before, give a set of five that reduce the solution
of a first-order partial differential equation F(x, y, z, p, q) = 0 to little more than
algebra:

\[ \frac{dx}{dt} = F_p \tag{8.4} \]
\[ \frac{dy}{dt} = F_q \tag{8.5} \]
\[ \frac{dz}{dt} = p F_p + q F_q \tag{8.6} \]
\[ \frac{dp}{dt} = -F_x - F_z\,p \tag{8.7} \]
\[ \frac{dq}{dt} = -F_y - F_z\,q \tag{8.8} \]

Geometrically, these equations determine a curve that lies in the solution surface
and a family of tangent planes that are tangent not only to the curve but (this is the
contribution of the equations for dp/dt and dq/dt) also to the surface. They define what
came to be called a characteristic strip.

The solution of the partial differential equation then proceeds much as before. An 
initial curve in space is given, and if it is crossed by a family of characteristic strips 
that are never tangent to it then these strips determine a unique solution, at least in a 
neighbourhood of the initial curve. 
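The five equations for a characteristic strip can be integrated numerically for any particular F. The sketch below uses the equation pq − z = 0, chosen here purely as an illustration, for which F_p = q, F_q = p, F_z = −1 and F_x = F_y = 0. It advances one strip with a classical fourth-order Runge-Kutta step, starting from values that satisfy F = 0, and checks that F remains zero along the strip and that z follows the closed form z₀e^{2t} that the equations imply. The starting values, step count, and function names are assumptions.

```python
import math

def F(x, y, z, p, q):
    """The illustrative equation F = p q - z."""
    return p * q - z

def strip_rhs(state):
    """Right-hand sides of the five characteristic-strip equations for F = p q - z."""
    x, y, z, p, q = state
    Fx, Fy, Fz, Fp, Fq = 0.0, 0.0, -1.0, q, p
    return (Fp, Fq, p * Fp + q * Fq, -Fx - Fz * p, -Fy - Fz * q)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = strip_rhs(state)
    k2 = strip_rhs(shifted(state, k1, h / 2.0))
    k3 = strip_rhs(shifted(state, k2, h / 2.0))
    k4 = strip_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, t_end, n=1000):
    """Follow the strip from t = 0 to t = t_end in n equal steps."""
    h = t_end / n
    for _ in range(n):
        state = rk4_step(state, h)
    return state

start = (0.0, 0.0, 1.0, 2.0, 0.5)   # (x, y, z, p, q) with p q = z
final = integrate(start, 1.0)
print("final state:", final)
print("F along the strip:", F(*final))
```

Carrying p and q along with x, y, z is what distinguishes the strip from a bare curve: without the last two equations the first three could not be integrated at all.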

For a statement of the existence and uniqueness theorem for the first-order partial 
differential equation 

F(x, y,Z, p,q) =0 


and a sketch of its proof, see the Appendix (Chap. C). 


8.3. Monge on General First-Order Equation


Monge reworked and extended this analysis in his [202]. His argument demon- 
strates his acute geometrical vision, but for that reason, it is not easy to follow. The 
comparison with Cauchy’s later, analytic method (see Sect. 31.1) is instructive both 
mathematically and historically. 

Monge first satisfied himself that any problem involving the envelope of a
one-parameter family of surfaces is a problem involving a first-order partial differential
equation:

\[ F(x, y, z, p, q) = 0. \]

To do so, he began with a one-parameter family of surfaces

\[ G(x, y, z, \alpha, \beta) = 0, \]

where β is a function of α, say β = ψ(α), and considered what he called the envelope
of these surfaces.

If S_α is a surface in the family, and S is the surface they envelope, then S_α touches
S along a curve C_α that Monge called a characteristic because it characterises the
contact of S_α and S. We are to think of the surface S_α moving, changing its shape as
it goes, and sweeping out a surface S. At each moment, S_α and S touch along C_α, so
we can also think of C_α as sweeping out the surface S.

Monge considered the tangent plane to the envelope S at a point P and noted 
that there are distinguished directions at the point. The tangent plane can roll in 
many ways, but two stand out: along the characteristic, and about the tangent to the 
characteristic (thought of as an axis). In the first case, the tangent plane stays on 
the same characteristic; in the second case, it pushes in a direction that leaves the 
characteristic and defines a new curve that Monge called a trajectory. 

Thus motivated, he considered a partial differential equation of the above form. 
Any solution of it will be a surface and locally therefore of the form z = z(x, y). 
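Differentiating F, and writing X, Y, Z, P, Q for its partial derivatives with respect
to x, y, z, p, q, gives

\[ X\,dx + Y\,dy + Z\,dz + P\,dp + Q\,dq = 0, \]

and because one always has

\[ dz = p\,dx + q\,dy \tag{8.9} \]

Monge deduced that

\[ (X + pZ)\,dx + (Y + qZ)\,dy + P\,dp + Q\,dq = 0. \tag{8.10} \]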
He then looked at the characteristics with parameters $\alpha$ and $\alpha + d\alpha$, which
amounts to letting

\[ dp = 0, \quad dq = 0, \]

and deduced that

\[ (X + pZ)\,dx + (Y + qZ)\,dy = 0, \tag{8.11} \]


which defines the projection of a trajectory on the (x, y)-plane. This equation is 
equally well obtained by stipulating that Pdp + Qdq = 0, so any surface satisfying 
this condition will touch the curve. 

Varying the tangent plane by rolling it along the envelope generates a developable
surface, and this allowed Monge to differentiate $dz = p\,dx + q\,dy$ keeping
dx, dy, dz fixed. In this way Monge obtained the equation

\[ dp\,dx + dq\,dy = 0 \]


for the line common to two ‘neighbouring’ tangent planes. If in particular the tangent
plane rolls along the characteristic, and therefore rotates about the trajectory, then the
value of dy/dx is given by Eq. (8.11) and so

\[ (X + pZ)\,dq - (Y + qZ)\,dp = 0. \tag{8.12} \]

Similarly, if the moving plane rolls along a trajectory, then

\[ P\,dy - Q\,dx = 0, \tag{8.13} \]


which is an equation for the characteristic. 

So, Monge concluded, the Eqs. (8.9), (8.10), (8.12), and (8.13), belong to the 
characteristic. 

Monge then observed that these four equations involve the differentials dp, dq, 
dx, dy, dz and that one can deduce ten equations each of which involves only two 
differentials. He listed these equations on p. 380 but, as he pointed out, only four are 
independent: 


\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{dz}{Pp + Qq} = -\frac{dp}{X + pZ} = -\frac{dq}{Y + qZ}. \tag{8.14} \]


When it came to solving these equations, Monge dealt first with what he called
the linear case

\[ Pp + Qq = L, \]

where P, Q, and L involve only x, y, and z. In this case, the equations that are
necessary are only

\[ P\,dy - Q\,dx = 0, \quad P\,dz - L\,dx = 0, \quad Q\,dz - L\,dy = 0, \]

which describe the projections of the characteristics on the three coordinate planes.


He noted that if these differential equations are easy to solve then one should do 
so, but if they are more intractable, then because they define a curve, one can regard
x, y, and z as all being functions of a single variable, that one might as well take to
be z. In which case, eliminating y between these equations leads to a second-order 
ordinary differential equation for x as a function of z. He then showed how, in this 
case, the envelope can be regarded as swept out by a curve that moves and changes 
its form in space in a way that is prescribed by the partial differential equation. 

Monge then turned to the general case. His method was to see what can be done
by regarding everything possible as a function of p. Four of the ten equations he
had listed before involve either dx/dp, dy/dp, dz/dp, or dq/dp in an equation with x, y, z, p, q.
Systematic elimination produces a third-order ordinary differential equation for q
as a function of p that involves only p and q but not x, y, or z.

Monge then, in a way I shall not describe, showed how to produce from this
equation the solution to the partial differential equation. But he admitted that this 
process could lead to long analytic difficulties and that it would be more useful to 
turn to interesting special cases, with which he proceeded to conclude his account. 

I omit Monge’s lengthy description of the most general case, because, as he put
it himself (p. 409):


The geometrical considerations on which we have based the study of the equations for the 
characteristics are familiar to students at the École Polytechnique, but they can be hard going
for other readers, and we shall therefore derive the same equations by a purely analytical 
process. We shall begin with the case of linear equations, and then pass to the general case. 


Monge also discussed the difficulties that arise in the many cases in which the 
equations for the characteristics are not immediately integrable and then turned his 
attention to partial differential equations that are reducible to the above (quasi-) linear 
form. Among these was a class of developable surfaces with equation 


F(z — px —qy, p,q) =0 


that, Monge remarked, 


includes a great many of those that M. Lagrange has treated in his beautiful work on particular 
integrals, printed in the Mémoires de l’Académie de Berlin for the year 1774. 


In fact, Monge’s equations are enough to solve the equation F(x, y, z, p, q) = 0.
Given a point $(x_0, y_0, z_0)$ and $p_0$, $q_0$ such that

\[ F(x_0, y_0, z_0, p_0, q_0) = 0 \]


the equations determine a curve through the point and a set of planes, one for each 
point on the curve, along which the planes are tangent to a surface z = z(x, y) that 
satisfies the partial differential equation. This curve is a characteristic curve, and the 
curve and these tangent planes define a characteristic strip. If the initial point lies 
on a curve that is transversal to a family of characteristics, then the family of planes 
envelope a surface z = z(x, y) that satisfies the partial differential equation. 
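
The way the equations for the characteristic determine a strip can be tried out numerically. The following is a minimal sketch, not Monge's own procedure: it assumes the sample equation pq − z = 0 (chosen only because z = xy is an obvious solution), introduces a curve parameter t that is not in the text, and uses a standard Runge-Kutta step. The function names are hypothetical. Starting from an initial element on z = xy, the strip should keep F = 0 and stay on that surface.

```python
import math

def F(x, y, z, p, q):
    # Sample equation chosen for illustration (an assumption): p*q - z = 0.
    return p * q - z

def strip_rhs(state):
    # Characteristic-strip system with a curve parameter t:
    #   dx/dt = P, dy/dt = Q, dz/dt = p*P + q*Q,
    #   dp/dt = -(X + p*Z), dq/dt = -(Y + q*Z),
    # where X, Y, Z, P, Q are the partial derivatives of F.
    x, y, z, p, q = state
    X, Y, Z, P, Q = 0.0, 0.0, -1.0, q, p  # valid for F = p*q - z only
    return (P, Q, p * P + q * Q, -(X + p * Z), -(Y + q * Z))

def rk4_step(state, h):
    # One classical fourth-order Runge-Kutta step of size h.
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = strip_rhs(state)
    k2 = strip_rhs(shifted(k1, h / 2))
    k3 = strip_rhs(shifted(k2, h / 2))
    k4 = strip_rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_strip(state, t_end=1.0, steps=1000):
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Initial element (x0, y0, z0, p0, q0) with F = 0; it lies on z = x*y.
start = (1.0, 2.0, 2.0, 2.0, 1.0)
end = integrate_strip(start)
print("F at the end of the strip:", F(*end))
print("z - x*y at the end point:", end[2] - end[0] * end[1])
print("p at t = 1:", end[3], "compared with", 2 * math.exp(1.0))
```

For this equation the system can also be solved by hand (p and q grow like e^t), which is what the last printed line compares against; with the opposite signs on dp and dq the value of F would drift away from zero.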
8.4 Monge on Second-Order Partial Differential Equation


In his ([201], Sect. 8) Monge extended these methods to the study of various second-order
partial differential equations, which he wrote in the form

\[ Ar + Bs + Ct + D = 0, \]

where A, B, C, D are arbitrary functions of x, y, z, p, q. He now argued that the
partial differential equation cannot yield expressions for r, s, and t, but the use of
the equations¹

\[ dp = r\,dx + s\,dy \quad\text{and}\quad dq = s\,dx + t\,dy \]

seemingly leads to these expressions for them:

\[
\begin{aligned}
B\,dp\,dy + C\,dq\,dy - C\,dp\,dx + D\,dy^2 &= -r\,(A\,dy^2 - B\,dx\,dy + C\,dx^2),\\
A\,dp\,dy + C\,dq\,dx + D\,dx\,dy &= s\,(A\,dy^2 - B\,dx\,dy + C\,dx^2),\\
A\,dp\,dx - A\,dq\,dy + B\,dq\,dx + D\,dx^2 &= -t\,(A\,dy^2 - B\,dx\,dy + C\,dx^2).
\end{aligned}
\]


These equations cannot hold in general, so if they hold simultaneously it must be
along certain curves in the solution surface. If these equations

\[
\begin{aligned}
A\,dy^2 - B\,dx\,dy + C\,dx^2 &= 0,\\
B\,dp\,dy + C\,dq\,dy - C\,dp\,dx + D\,dy^2 &= 0,\\
A\,dp\,dy + C\,dq\,dx + D\,dx\,dy &= 0,\\
A\,dp\,dx - A\,dq\,dy + B\,dq\,dx + D\,dx^2 &= 0,
\end{aligned}
\]

hold simultaneously (any two imply the other two), and if two of these equations have
the solutions v = a and u = b, where a and b are arbitrary constants of integration,
then the general solution of the partial differential equation is $v = \varphi(u)$, where $\varphi$ is
an arbitrary function.

He gave some examples of how his method works in practice. In §12 he supposed
that A, B, C, D are constants. The equation

\[ A\,dy^2 - B\,dx\,dy + C\,dx^2 = 0 \]


gives rise to equations

\[ dy - k\,dx = 0 \quad\text{and}\quad dy - k'\,dx = 0, \]

and so

\[ y - kx = a \quad\text{and}\quad y - k'x = a', \]

where $k$ and $k'$ are the roots of the equation $Ak^2 - Bk + C = 0$. The equation

\[ A\,dp\,dy + C\,dq\,dx + D\,dx\,dy = 0 \]

then becomes

\[ Ak\,dp + C\,dq + Dk\,dx = 0, \]

which implies

\[ Akp + Cq + Dkx = b. \]

¹ Recall that $r = \frac{\partial^2 z}{\partial x^2}$, $s = \frac{\partial^2 z}{\partial x\,\partial y}$, and $t = \frac{\partial^2 z}{\partial y^2}$.


Monge deduced that a first integral of the partial differential equation is

\[ Akp + Cq + Dkx = \varphi'(y - kx), \]

and another is

\[ Ak'p + Cq + Dk'x = \psi'(y - k'x), \]

where the functions $\varphi$ and $\psi$ are arbitrary, so he deduced that the general solution of
the original partial differential equation is

\[ Az + \tfrac{1}{2}Dx^2 = \varphi(y - kx) + \psi(y - k'x). \]
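
A quick numerical check makes the claim concrete. The sketch below is only an illustration under stated assumptions: the constants A, B, C, D and the choice of sine and cosine for the two arbitrary functions are arbitrary sample values, the roots are taken to be real and distinct, and the second derivatives r, s, t are approximated by central differences. It evaluates Ar + Bs + Ct + D on the proposed solution, solved for z.

```python
import math

# Sample constants (assumptions for illustration); here k and k' are real and distinct.
A, B, C, D = 1.0, -3.0, 2.0, 4.0
disc = math.sqrt(B * B - 4 * A * C)
k1 = (B + disc) / (2 * A)   # roots of A*k**2 - B*k + C = 0
k2 = (B - disc) / (2 * A)

phi, psi = math.sin, math.cos   # any twice-differentiable functions would do

def z(x, y):
    # A*z + D*x**2/2 = phi(y - k*x) + psi(y - k'*x), solved for z.
    return (phi(y - k1 * x) + psi(y - k2 * x) - 0.5 * D * x * x) / A

def residual(x, y, h=1e-3):
    # Central differences for r = z_xx, s = z_xy, t = z_yy.
    r = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    s = (z(x + h, y + h) - z(x + h, y - h)
         - z(x - h, y + h) + z(x - h, y - h)) / (4 * h**2)
    t = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    return A * r + B * s + C * t + D

print("roots:", k1, k2)
print("A r + B s + C t + D at (0.3, 0.7):", residual(0.3, 0.7))
```

The residual is zero only up to the discretisation error of the finite differences. The check says nothing about the cases of equal or complex roots, which is exactly the distinction Monge did not draw.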


Monge did not distinguish the cases when k, k’ are real and when they are complex 
conjugate (or even when they are equal), and this indifference to concerns raised by 
Euler forty years earlier may be down to Monge’s optimism that the problem Euler 
had pointed to could be solved more easily than was to turn out to be the case. 
Monge’s next example (§13) was the partial differential equation for surfaces
generated by a line moving in space while remaining parallel to the (x, y)-plane:

\[ q^2 r - 2pqs + p^2 t = 0. \]

His method leads to the equations

\[ q^2\,dy^2 + 2pq\,dx\,dy + p^2\,dx^2 = 0 \quad\text{and}\quad q^2\,dp\,dy + p^2\,dq\,dx = 0. \]

The first leads to

\[ (q\,dy + p\,dx)^2 = 0, \]

which implies that z = a, a constant. The second then implies that

\[ p\,dq - q\,dp = 0 \]

and so q = bp where b is a constant, and after a little work the solution of the partial
differential equation is found to be

\[ x + y\varphi(z) = \psi(z), \]

a result already known. (Exercise 1 below asks you to derive the partial differential
equation generated by a line moving as described.)


8.5 Lagrange at the École Polytechnique, 1806


In his lectures in 1806 at the still-new École Polytechnique Lagrange gave another
account, which was very close to the one Monge had given some years before. He 
began Lecture 20 by observing that there are three immediate consequences of an 
equation F(x, y, z) = 0, which are obtained by finding the total differential of F 
and taking the partial derivatives with respect to x and y: 


\[
\begin{aligned}
F_x\,dx + F_y\,dy + F_z\,dz &= 0,\\
F_x + F_z z_x &= 0,\\
F_y + F_z z_y &= 0.
\end{aligned}
\]


Lagrange then showed that these are useful when looking for a solution of a
first-order partial differential equation, by giving two examples. In the first,

\[ z_x + M z_y = N, \]

where M and N are constants, Lagrange introduced the ‘primitive’ equation

\[ z - Nx = \varphi(y - Mx), \]

and differentiated it partially with respect to x and y. The resulting two equations
imply that $z = Nx + \varphi(y - Mx)$ is a solution of the partial differential equation.
In the second example, M and N are now regarded as functions of x, y, z. If the
solution (or primitive equation) is of the form F(x, y, z) = 0 then partial differentiation
of F with respect to x and y implies that

\[ F_x + M F_y + N F_z = 0, \]

and by the first of the three consequences above, regarding x, y, z as functions of a
variable t, the partial differential equation becomes

\[ dF = (dy - M\,dx)F_y + (dz - N\,dx)F_z = 0. \]


So, said Lagrange, the solution of the partial differential equation is found by solving 
(the fourth and fifth consequences) 


\[ dy - M\,dx = 0 \quad\text{and}\quad dz - N\,dx = 0. \]


This is exactly what we have already seen in Monge’s treatment of the quasi-linear 
equation. 

The solution of these ordinary differential equations introduces two arbitrary 
constants a and b, and the result of eliminating the variable z from them will be a 
second-order equation in x and y in which one can set dx = 1 if one wants to regard 
y as a function of x, or dy = 1 if one wants to regard x as a function of y. Then z
can be found as a function of x and y. 

The answer is now given as an implicit function F(x, y, z) = 0 in which the
constants a and b appear, which Lagrange wrote as $\Phi(a, b)$, highlighting that a and
b are functions of each other. Moreover, the function F involves only these two
constants (other than the ones in M and N), and so eliminating two of the variables
x, y, z will eliminate the third (because dF = 0 and so F cannot be a function
of the remaining variable). The primitive equations yield expressions for a and b as
functions of x, y, z, say a = P(x, y, z) and b = Q(x, y, z), so the primitive equation
becomes $\Phi(P, Q) = 0$.

Then he returned to the case of two independent variables and, as it were, ran 
the above argument backwards to show how first-order partial differential equations 
arise by eliminating the two parameters a and b from an equation of the form 


F(x, y,z,a,b) =0, 


which he called the complete primitive equation (he required that a single differ- 
entiation cannot eliminate both parameters at the same time). This equation, he 
showed, leads to a more general one involving an arbitrary function, on supposing 
that $b = \varphi(a)$ and letting a satisfy

\[ \frac{\partial}{\partial a} F(x, y, z, a, \varphi(a)) = 0. \]

Finally, the singular primitive equation is obtained by letting a and b satisfy

\[ \frac{\partial F}{\partial a} = 0, \quad \frac{\partial F}{\partial b} = 0. \]

He gave this example. The partial differential equation

\[ z = x z_x + y z_y \]

has the complete primitive equation

\[ z = ax + by. \]


To obtain the primitive general equation, he set $b = \varphi(a)$ and took derivatives with
respect to a only, thus finding

\[ z = ax + y\varphi(a), \quad x + y\varphi'(a) = 0, \tag{8.15} \]

from which it was necessary to eliminate a. Because $\varphi$ was arbitrary, this led to an
infinity of different complete primitives, each with two arbitrary constants. If, for
example,

\[ \varphi(a) = A - \frac{a^2}{4B}, \]

the procedure just outlined results in

\[ a = \frac{2Bx}{y} \quad\text{and}\quad z = Ay + \frac{Bx^2}{y} \]

as a new form for the primitive equation.
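
The elimination in this example is easy to reproduce. The sketch below is an illustration only: it takes the sample function φ(a) = A − a²/(4B), eliminates a exactly as described, and then checks numerically (with central differences, and with arbitrary sample values for x, y, A, B that are assumptions of the sketch) that the result agrees with Ay + Bx²/y and satisfies z = x z_x + y z_y. The function names are hypothetical.

```python
def phi(a, A, B):
    # The sample choice of the arbitrary function: phi(a) = A - a**2 / (4B).
    return A - a * a / (4 * B)

def primitive_from_envelope(x, y, A, B):
    # x + y*phi'(a) = 0 with phi'(a) = -a/(2B) gives a = 2*B*x/y;
    # substituting into z = a*x + y*phi(a) eliminates a.
    a = 2 * B * x / y
    return a * x + y * phi(a, A, B)

def pde_residual(x, y, A, B, h=1e-5):
    # z - (x*z_x + y*z_y), with central differences for z_x and z_y.
    z = primitive_from_envelope
    z_x = (z(x + h, y, A, B) - z(x - h, y, A, B)) / (2 * h)
    z_y = (z(x, y + h, A, B) - z(x, y - h, A, B)) / (2 * h)
    return z(x, y, A, B) - (x * z_x + y * z_y)

x, y, A, B = 1.5, 2.0, 0.7, -1.2
closed_form = A * y + B * x * x / y
print("difference from A*y + B*x**2/y:", primitive_from_envelope(x, y, A, B) - closed_form)
print("residual of z = x z_x + y z_y:", pde_residual(x, y, A, B))
```

Both printed numbers should be zero up to rounding and discretisation error. Changing `phi` to another differentiable function would require solving x + yφ′(a) = 0 for a afresh, which is the step that is not always possible in closed form.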

In this way, he remarked, one can find as many complete primitives as one likes, 
but the general primitive equation is never among them. In the present case, it is 
impossible to find a function $\varphi(a)$ such that the resulting primitive equation is z =
Ax + By. I omit the proof. 

Lagrange then discussed the theory of envelopes and its role in the theory of partial 
differential equations and remarked (p. 348) 

One can see, in the writings of Monge, the theory of the generation of these surfaces and
the equations that can represent them developed to its full extent and with particular and
ingenious considerations that belong to them.


He then returned to the solution method for first-order partial differential equations 
that he had discussed 34 years earlier, in his [173], and noted that some difficulties 
remained to be resolved that, moreover, he had not been able to treat in the Théorie
des fonctions.²

Lagrange’s method for tackling first-order partial differential equations is frequently
presented in textbooks as the Lagrange-Charpit method or even as Charpit’s
method. Many historians say that we know almost nothing about Charpit, and report
only that Lacroix said that Paul Charpit submitted a memoir in 1784 on the general
solution of first-order partial differential equations, but it was never published and that
Charpit died that year (28 December 1784).³ However, as Grattan-Guinness (1990,
151) explained, the manuscript is not lost; indeed two copies survive. Lagrange saw
the manuscript in 1793, and the Lagrange-Charpit method is essentially his more
rigorous account of what Charpit wrote, which Lagrange presented in his Leçons sur
le calcul des fonctions ([177], Lecture 20).⁴

² Lagrange also considered linear equations in more than two independent variables in this paper.
This was the territory that no one else had mastered, and he indicated that his approach generalised,
but it cannot be treated here.


8.5.1 Lacroix’s Traité (1798) 


Monge’s ideas formed the basis of the rather superficial account given by Lacroix, 
the great textbook writer of the period, in his Traité of 1798, but this indicates the 
difficult nature of the subject for even the best students of the time. He dealt with what 
we would call the quasi-linear first-order case and some examples of second-order 
equations. For the linear second-order partial differential equation 


\[ a z_{xx} + b z_{xy} + c z_{yy} = V(x, y) \]

with constant coefficients a, b, c he factorised the equation for the characteristics and
proceeded formally to a solution without caring whether the roots of the equation

\[ am^2 - bm + c = 0 \]


were real or complex, but only that they were distinct. 
He noted that many equations escaped this analysis, notably the equation 


\[ z_{xx} = z_y \]


but that in this case the equation could be solved by an infinite series of sums of 


terms of the form 
eixtmiy : 


Lacroix then commented on some remarks about the generality of the solution,
noting that Laplace had been of the opinion that the equation could not be solved
by an arbitrary function, but the Italian mathematician Pietro Paoli had disagreed.
Evidently, Lacroix had little idea of the power inherent in series of that kind. Such
was the situation when Fourier took up the problem, as we shall see in Chap. 10.

³ On Paul Charpit de Ville Coer and his manuscript, see Grattan-Guinness and Engelsman [122]. It
seems that Charpit came from Strasbourg to Paris in 1782, and became an assistant to Monge, who
taught him solid geometry. He read an extract of his paper to the Académie des Sciences on 30 June
1784, but nothing was done with it. Laplace acquired a copy, and he passed it on to Lagrange (who
had been in Berlin in 1784) in 1793. Lagrange in due course sent it to Charpit’s friend Arbogast,
who made a copy that is now in Florence. In 1798 Lacroix described Charpit’s paper in his Traité
du calcul différentiel et du calcul intégral ([168], 496-497, 513-516). Curiously, his copy of the
manuscript is shorter than Arbogast’s and seems less reliable.

⁴ Lagrange’s Leçons are in vol. 10 of his Oeuvres. Charpit’s name is not mentioned.


8.6 Exercises 


1. Explain why a line in space that is parallel to the (x, y)-plane has an equation of
   the form y = m(z)x + z (ignoring lines parallel to the x-axis). Eliminate m(z)
   by differentiating twice, and deduce that the equation of the surface generated by
   a line moving in space while remaining parallel to the (x, y)-plane is

   \[ q^2 r - 2pqs + p^2 t = 0. \]

2. What is the Monge cone for a quasi-linear first-order partial differential equation?
3. Follow through Monge’s analysis for the first-order partial differential equations
   considered by Euler.
4. Follow through Monge’s analysis for the wave equation and Laplace’s equation.


Questions 


1. What does the near silence about the heat equation, even as a partial differential
   equation without any physical interpretation, say about the solution methods
   available around 1800?
2. To what extent does Monge’s geometrical analysis help you? How do you find it
   compares with the more formal account in Sect. 8.2.2 above?
3. How fair is it to say that Lagrange’s account of first- and second-order partial
   differential equations in two independent variables is Monge’s without the geometry?
   What advantages and disadvantages are there in the two approaches?


Chapter 9
Revision


9.1 Revision and Assessment 1 


This chapter is given over to revision and discussion of the first assignment, see H.2 
in Appendix H. 


9.1.1 Comments 


The Assessment asked students to reflect on what they had studied either by imagining 
becoming a student of mechanics and dynamics around 1770, or more generally, on 
what is involved around 1770 in the study of partial differential equations. 

They were to do so by writing a letter (some years as an English professor writing 
to a student, some years as the student writing back to their former professor). I asked 
for a letter, not a history, to help them get into the way of seeing things through the 
protagonists’ eyes. No one can see the future, and historians shouldn’t try. And in 
fact, generally speaking, answers that began as a letter made it easier for the writer 
to engage with the developments historically. 

No comparison between the mechanics of Newton and Euler should diminish 
Newton’s remarkable achievements in his Principia Mathematica. His major work 
is a theory of celestial mechanics that delivered a remarkably accurate theory of the 
motion of the planets and their satellites, in which to a high level of detail, only the 
motion of the Moon around the Earth remained unaccounted for. 

Euler managed no comparable single achievement in this field, but he produced 
theories of motion for rigid bodies of any kind (Newton could only handle spheres, 
which he showed could be treated as points), and of fluids and gases. There is also 
his account of the vibrating string. 

Whereas Newton’s method was largely geometrical but intermittently invoked 
calculus-type series of arguments, starting from a set of three laws of motion, Euler 
always started with infinitesimal pieces of bodies and worked towards equations
of motion expressed in calculus terms as differential equations. He reformulated
Newton’s laws of motion as equations of motion, and his calculus of variations was 
a completely novel approach to geometrical and mechanical questions, one rapidly 
improved by Lagrange. The central importance of differential equations, and the 
means of solving them, unite all of Euler’s work in this area. 

Perhaps nothing compares with the invention of (single-variable) calculus and 
the creation of celestial mechanics, but if anything does it might be Euler’s breadth 
of applications coupled with the generality of his (several-variable) calculus. But 
Newton’s ingenious methods were not what anyone could follow, and Euler’s were.¹

The origins of partial differential equations are the wave equation and its investigations
by d’Alembert, Euler, Daniel Bernoulli, and Lagrange. This led to the idea
that its solutions are any functions of the form f(x + ct) + g(x — ct), but that led 
to deep questions about what is an arbitrary function. 

First-order linear partial differential equations were first studied by treating them 
like ordinary differential equations, so a good answer would go into a little mathe- 
matical detail about them because later methods were different. Note, for example, 
the advance in the theory of characteristics made by d’Alembert.

The theory of second-order linear partial differential equations was much less 
developed. Success with the wave equation came with Euler’s worry about (what 
came to be called) the Laplace equation. 

There are other aspects worth mentioning: fluids, the propagation of sound, and, 
perhaps, the calculus of variations. 


¹ A good source is an essay by Maronne and Panza entitled Euler: Reader of Newton. See
https://hal.archives-ouvertes.fr/hal-00415933.


Chapter 10
The Heat Equation


10.1 Introduction 


Fourier series are infinite series of sines or cosines that are used to represent a 
function. They had been used by Euler and others in the eighteenth century in the 
study of the vibrating string and in celestial mechanics, but Fourier’s name is rightly 
attached to them because of the great generality and utility he envisaged for them 
and the success he put them to in the study of heat diffusion. His strong claims for 
them were also to prove a valuable challenge to mathematicians who came after him 
and demanded more rigour in analysis. 


10.2 Fourier and His Series 


Joseph Fourier (Fig. 10.1) was born in France in 1768, and for much of his life he was 
caught up in the political transformation of France. When he was an assistant lecturer 
at the École Polytechnique in 1795 he came to the attention of Gaspard Monge, who
became a prominent supporter of Napoleon, and who selected Fourier for the French
expedition to Egypt in 1798. When that ended in defeat Fourier returned to France in
1801, but Napoleon, impressed by his organisational talents, made him the prefect
or Governor of the Department of Isère, and, still impressed, a Baron in 1808.

The defeat of Napoleon led to a difficult period in Fourier’s life, as the new regime 
came down hard on those it could accuse of having supported either the revolution or 
Napoleon, but he recovered and with the support of Laplace he became the permanent 
secretary of the Académie des Sciences in 1822 and was elected to the Académie
Française in 1827. In 1830, he died from complications of an illness caught in Egypt.

Fig. 10.1 Joseph Fourier (1768-1830) by Amédée Félix Barthélemy Geille, after Jules Boilly, c. 1823.
Portraits et Histoire des Hommes Utiles, Collection de Cinquante Portraits, Société Montyon et Franklin,
1839-1840


His book, Théorie analytique de la chaleur (The analytical theory of heat) came 
out in 1822. It is devoted to the topic of heat diffusion.* By this time, it was rea- 
sonably well known that heat was a state of the hot material, not a fluid substance 
permeating the material in an amount proportional to its temperature. Not much 
else was understood, however, and accordingly, Fourier made as few assumptions as 
possible about the nature of heat. Rather, he concentrated on formulating the way 
heat passes from one part of the body to an adjacent part in a very short interval of 
time. He argued that it was enough to suppose that the amount of heat that passes 
is proportional to the duration of the time interval, the infinitesimal temperature dif- 
ference between adjacent parts, and a certain function of the distance between the 
parts. He was able to examine homogeneous bodies of simple shapes with simple 
temperature distributions on their boundaries, say when the boundaries are kept at 
fixed temperatures. He also tested his results experimentally by heating simple shapes 
and measuring the temperature at various points and various times and found good 
agreement with his theoretical predictions.³

One of his examples was a window—of an unusual shape, being infinite from side 
to side and top to bottom, but of finite thickness—kept at a fixed temperature on each 
side and warmer inside than out. In this case, the temperature drops linearly with the 
distance from the warmer side. Another of his examples was that of an oven. 

The problems that Fourier considered have two aspects. One concerns the flow of
heat in the body, and he showed that that is described by a differential equation. The
other concerns the temperature on the boundaries of the body, and although these
can be arbitrary they are easiest to handle if the boundary is made of simple shapes.
Only if the boundary conditions, as these specifications are called, are simple can
explicit solutions be found.

¹ It was written in several stages and has a complicated publication history that Fourier described in
its Preliminary Discourse. In 1812, a version of his theory won a prize of the Institut de France in
Paris Academy, but Lacroix and Laplace were unable to overcome the objections of Lagrange and
let Fourier’s account be published. Lagrange accepted that the correct equation for heat diffusion
had been found, but not the generality of the solutions. For a rich introduction to Fourier and his
work, see Grattan-Guinness and Ravetz [123].

² See the translation by A. Freeman in the Internet Archive.

³ See Grattan-Guinness and Ravetz ([123], 421-440) for Fourier’s account of 1807.


We now turn to one of his examples to see how he formulated the mathematical 
equation that describes the flow of heat. 

Fourier considered semi-infinite bars with either semi-circular or square cross
sections, in which one end is kept hot and the rest uniformly cool. The problem is to
find how hot the bar becomes when it reaches a steady state. He argued that at each
point in the interior of the bar, heat (measured by the temperature v, a function of
x, y, z) passes through in each of the x-, y-, and z-directions. In §98, he considered
an infinitesimal cube in the interior of the bar and stated that the amount that enters
the face with sides dx and dy is $-K\,dx\,dy\,\frac{\partial v}{\partial z}$ evaluated at that face, where K is a
quantity determined by the nature of the body, and what leaves the opposite face is
$-K\,dx\,dy\,\frac{\partial v}{\partial z}$ evaluated at that face.⁴ The minus sign arises because heat flows from
a hot body to a cold one.

What left at the second face was found by replacing z by z + dz, and so the
difference between what enters and what leaves is the difference in the values of
$-K\,dx\,dy\,\frac{\partial v}{\partial z}$ at z and at z + dz. Because the
temperature is in a steady state, the sum of these quantities taken over the three pairs
of opposite faces of a cube is zero, and the resulting equation is


\[ \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2} = 0. \tag{10.1} \]


The distribution of heat in the body is described by the solution of this differential 
equation that satisfies the stated boundary conditions. 

A similar argument allowed Fourier to derive the equation for the distribution of
heat in a body that is not in a steady state, so the temperature v = v(x, y, z, t) is now
a function both of position and time. He now argued (§142) that, by regarding the
body as made up of little cubes, the differential equation (10.2) will hold for the flow
of heat in any body. As before, the amount of heat leaving a cube in the z-direction is
$-K\,dx\,dy\,dz\,\frac{\partial^2 v}{\partial z^2}$, but now the sum over the pairs of opposite faces is proportional to
the rate of change of temperature, which is $\frac{\partial v}{\partial t}$. The result (§128) is the heat equation:

\[ K\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2}\right) = \frac{\partial v}{\partial t}. \tag{10.2} \]

As before, solutions of this partial differential equation are required that satisfy 
some given boundary conditions (and, if v is a function of ¢, initial conditions). 
These conditions, however, generally forced him to suppose that the body has a 
simple shape, such as a cuboid, or one of a limited range of other shapes that can be 
handled by finding suitable coordinate transformations. 


⁴ I set aside Fourier’s remarks about what physical properties affect K.

To show how this could be done, Fourier had first to show how to find a general
solution to the partial differential equation, and then show how to fit the general solution
to the boundary conditions. He began (§166) with the simplest two-dimensional
case, a semi-infinite strip of a given width, $\pi$ in suitable units, located between $-\pi/2$
and $+\pi/2$, included between two parallel infinite sides at a given temperature 0 in
some units which at its base has a temperature of 1. He chose coordinates (here
relabelled to be more familiar) in which y measures the height above the base and
x measures the distance of a point from the mid-line of the strip. The differential
equation of the steady-state distribution, which now involves only two variables, is

\[ K\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) = 0. \tag{10.3} \]


It is likely he was inspired by existing treatments of the wave equation. At all 
events, he looked for a solution of the form 


v(x, y) = f(x)g(y). 


This forces

\[ g''(y)/g(y) = -f''(x)/f(x), \tag{10.4} \]

so both sides must be constant, say $m^2$, and the solutions are of the form

\[ f(x) = \cos mx, \quad g(y) = e^{-my}. \tag{10.5} \]

The boundary conditions force m to be positive, otherwise $e^{-my}$ would become
infinitely great, and if the solution is to vanish for $x = \pm\pi/2$ for all y, then m
must be an odd integer.
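
Both claims about the separated solutions can be checked numerically. The following sketch is an illustration, not part of Fourier's argument: it approximates v_xx + v_yy by a five-point finite difference at an arbitrarily chosen interior point and evaluates cos(mx)·e^(−my) on the side x = π/2. The step size, the sample point, and the function names are assumptions of the sketch.

```python
import math

def v(m, x, y):
    # Separated solution cos(m x) * exp(-m y) of v_xx + v_yy = 0.
    return math.cos(m * x) * math.exp(-m * y)

def laplacian(m, x, y, h=1e-3):
    # Five-point finite-difference approximation to v_xx + v_yy.
    return (v(m, x + h, y) + v(m, x - h, y) + v(m, x, y + h) + v(m, x, y - h)
            - 4 * v(m, x, y)) / h**2

for m in (1, 2, 3, 5):
    edge = v(m, math.pi / 2, 0.5)   # value on the side x = pi/2
    print(m, "laplacian:", laplacian(m, 0.4, 0.5), "value on the side:", edge)
```

Every m gives a Laplacian that is zero up to discretisation error, but only the odd values of m vanish on the side of the strip; m = 2 is included to show a value that fails the boundary condition.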

This led Fourier to contemplate solutions of the form, as he wrote (§169),

\[ a e^{-y}\cos x + b e^{-3y}\cos 3x + c e^{-5y}\cos 5x + d e^{-7y}\cos 7x + \text{etc.} \tag{10.6} \]

subject to the boundary condition at the base that

\[ 1 = a\cos x + b\cos 3x + c\cos 5x + d\cos 7x + \text{etc.} \tag{10.7} \]


The infinitely many arbitrary constants are now to be determined from the bound- 
ary conditions. Fourier gave two methods. The first is an impressive tour de force, 
but the second one is much easier and has been used ever since, although with much 
more attention to the conditions under which it is valid. 

Fourier argued as follows. We have f(x) = 1, $-\pi/2 < x < \pi/2$. The corresponding
Fourier series is $\sum_{n=1}^{\infty} a_n \cos nx$, so

\[ 1 = \sum_{n=1}^{\infty} a_n \cos nx, \]

so, multiplying both sides by cos jx and integrating,

\[ \int_{-\pi/2}^{\pi/2} \cos jx\,dx = \int_{-\pi/2}^{\pi/2} \left(\sum_{n=1}^{\infty} a_n \cos nx \cos jx\right) dx. \]

The left-hand side is

\[ \frac{1}{j}\sin jx\,\Big|_{-\pi/2}^{\pi/2} = \frac{2}{j}\sin\frac{j\pi}{2}. \]

The quantity $\sin(j\pi/2)$ is zero when j is even, it is +1 when j is of the form 4k + 1,
and it is $-1$ when j is of the form $4k - 1$. So the left-hand side is zero if j is even,
it is $+\frac{2}{j}$ if j is of the form 4k + 1, and it is $-\frac{2}{j}$ if j is of the form $4k - 1$.

The right-hand side is

\[ \frac{\pi}{2} a_j, \]

so $a_j$ is zero if j is even, it is $+\frac{4}{j\pi}$ if j is of the form 4k + 1, and it is $-\frac{4}{j\pi}$ if j is of
the form $4k - 1$.

This gives the result

\[ 1 = \frac{4}{\pi}\left(\cos x - \frac{1}{3}\cos 3x + \frac{1}{5}\cos 5x - \cdots\right), \]

as Fourier said.
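
The coefficients found here can be recovered by carrying out the integration numerically. The sketch below is a hedged illustration: it assumes the normalisation a_j = (2/π) ∫ cos jx dx over (−π/2, π/2), which is what dividing the left-hand side by π/2 amounts to, and it uses a simple midpoint rule whose number of subintervals is an arbitrary choice.

```python
import math

def cosine_coefficient(j, n=2000):
    # a_j = (2/pi) * integral of cos(j x) over (-pi/2, pi/2), by the midpoint rule.
    lo, hi = -math.pi / 2, math.pi / 2
    h = (hi - lo) / n
    total = sum(math.cos(j * (lo + (i + 0.5) * h)) for i in range(n))
    return (2 / math.pi) * total * h

def closed_form(j):
    # 4 sin(j pi / 2) / (j pi): zero for even j, alternating in sign for odd j.
    return 4 * math.sin(j * math.pi / 2) / (j * math.pi)

for j in range(1, 8):
    print(j, cosine_coefficient(j), closed_form(j))
```

The two columns should agree to several decimal places, with the even coefficients vanishing and the odd ones alternating in sign.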
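It gave him that for all values of $x$ between $-\pi/2$ and $+\pi/2$
\[ \frac{\pi v}{4} = e^{-y}\cos x - \frac{1}{3} e^{-3y}\cos 3x + \frac{1}{5} e^{-5y}\cos 5x - \frac{1}{7} e^{-7y}\cos 7x + \text{etc.}, \tag{10.8} \]
where on the boundary⁵
\[ 1 = \frac{4}{\pi}\Big( \cos x - \frac{1}{3}\cos 3x + \frac{1}{5}\cos 5x - \cdots \Big). \tag{10.9} \]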


Here is a graph of the sum of the first 105 terms of the series for the function
$F(x) = \pi/4$ (Fig. 10.2):
\[ \cos(x) - \frac{1}{3}\cos(3x) + \frac{1}{5}\cos(5x) - \frac{1}{7}\cos(7x) + \cdots. \]
Note how small the difference between the sum of the series and the sum of its first
105 terms has become.
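
The size of that difference can be explored numerically. The following is a minimal sketch, not taken from the book: the function name `partial_sum` is illustrative, and the only assumption is the series displayed above, whose sum is $\pi/4$ for $|x| < \pi/2$ and $-\pi/4$ for $\pi/2 < |x| < \pi$.

```python
import math

def partial_sum(x, terms):
    """Sum of the first `terms` terms of cos x - cos(3x)/3 + cos(5x)/5 - ..."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1                      # odd frequencies 1, 3, 5, ...
        total += (-1) ** k * math.cos(n * x) / n
    return total

def error_at(x, terms=105):
    """Difference between the limiting value (+pi/4 or -pi/4) and the partial sum."""
    target = math.pi / 4 if abs(x) < math.pi / 2 else -math.pi / 4
    return target - partial_sum(x, terms)
```

Calling `error_at(x)` for a few values of `x` away from the jumps at $\pm\pi/2$ shows errors of the order of a few thousandths; close to the jumps the partial sums overshoot, as the figure suggests.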


⁵ Note that we have a series of continuous (indeed, analytic) functions that defines a function that is
plainly not continuous.

Fig. 10.2 The sum of the first 105 terms of the Fourier series for $F(x) = \pm\pi/4$


To reach these conclusions, he had observed in §220 that when $j \neq k$ (the switch
from cosine to sine is irrelevant here)
\[ \int_0^{\pi} \sin jx \sin kx \, dx = \frac{1}{2}\Big[ \frac{1}{k-j}\sin(k-j)x - \frac{1}{k+j}\sin(k+j)x \Big]_0^{\pi} = 0, \tag{10.10} \]
and that the integral is $\pi/2$ when $j = k$. He accordingly deduced that the coefficients
of the series can be found by multiplying the series by sin jx, for each value of j, 
and integrating. For, if 


\[ f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} b_n \sin nx, \]
then multiplying both sides by $\sin jx$ and integrating gives
\[ \int_0^{\pi} f(x)\sin jx \, dx = \int_0^{\pi} \frac{1}{2}a_0 \sin jx \, dx + \int_0^{\pi} \sum_{n=1}^{\infty} b_n \sin nx \sin jx \, dx. \]
This, he simply assumed, is equal to
\[ \int_0^{\pi} \frac{1}{2}a_0 \sin jx \, dx + \sum_{n=1}^{\infty} \int_0^{\pi} b_n \sin nx \sin jx \, dx. \]


In this expression, all the terms $\int_0^{\pi} b_n \sin nx \sin jx \, dx$ vanish except for the one in
which $n = j$, and this one is equal to $\frac{\pi}{2} b_j$. So, for a series in which $a_0 = 0$,
\[ \int_0^{\pi} f(x) \sin jx \, dx = \frac{\pi}{2} b_j, \]
and therefore,
\[ b_j = \frac{2}{\pi} \int_0^{\pi} f(x) \sin jx \, dx. \]


Similar results apply to series of cosines, to series of sines and cosines, and to series
obtained when the period is different (such as $\pi$ or 1). For example, changing the
range of integration from $(-\pi/2, \pi/2)$ to $(-\pi, \pi)$, which is much more convenient
for later use, gives, when $j \neq k$,
\[ \int_{-\pi}^{\pi} \sin jx \sin kx \, dx = \frac{1}{2}\Big[ \frac{1}{k-j}\sin(k-j)x - \frac{1}{k+j}\sin(k+j)x \Big]_{-\pi}^{\pi} = 0, \tag{10.11} \]
and the integral is $\pi$ when $j = k$.

He went on to claim that any function $f$ defined on the interval $[-\pi, \pi]$ can be
written as an infinite series of sines and cosines in any of these forms, depending on
how they are to be continued outside the interval on which they have been defined:

• $f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$ (mixed series)
• $f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos nx$ (cosine series, which works only for even functions)
• $f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} b_n \sin nx$ (sine series, which works only for odd functions)

The coefficients of the mixed series are given by the formulae $a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx$
and
\[ a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx \, dx, \quad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx \, dx. \tag{10.12} \]


The result is that the heat equation can be solved, for bodies with simple shapes, 
by the method of Fourier series, and the answer is written as an infinite series of either 
sines, cosines, or both. The solution of the partial differential equation appears as an 
infinite series with largely arbitrary coefficients. The boundary conditions allow the 
coefficients to be determined, and a unique solution is exhibited. 
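
As an illustration of how the coefficient formulae (10.12) are used, here is a small numerical sketch. It is not from the book: the function name `fourier_coefficients` and the midpoint rule are choices made here for illustration, and the test function $f(x) = x$ is used because its coefficients are known exactly ($a_k = 0$, $b_k = 2(-1)^{k+1}/k$).

```python
import math

def fourier_coefficients(f, k, samples=20000):
    """Approximate a_k and b_k of (10.12) on [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / samples
    a_k = 0.0
    b_k = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h       # midpoint of the i-th subinterval
        a_k += f(x) * math.cos(k * x) * h
        b_k += f(x) * math.sin(k * x) * h
    return a_k / math.pi, b_k / math.pi

def identity(x):
    """The test function f(x) = x."""
    return x
```

For example, `fourier_coefficients(identity, 2)` should return values close to `(0.0, -1.0)`. The sketch says nothing about whether the resulting series converges to $f$; that is exactly the question Fourier left open.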


Or rather, we should say, Fourier claimed that this could be done. Very quickly 
his claim became a challenge to mathematicians to prove that it is correct—and this 
was to become a long and fascinating story that we can only begin to tell here. 

Fourier’s ideas generated quite some discussion in print. Siméon Denis Poisson
rightly complained that Fourier’s method for finding the coefficients in a Fourier
series⁶ “has not in fact been demonstrated in a precise and rigorous manner”. A
decade later he objected (again correctly) that the fundamental assumption that an
arbitrary function can be expanded as an infinite series of sines and cosines had
not been proved, and Charles-François Sturm joined in, remarking that “Fourier and
other geometers seem to have misunderstood the importance and the difficulty of
this problem, which they have confused with that of determining the coefficients”.⁷

⁶ Poisson ([226], 46), quoted in Bottazzini ([21], 188).


10.2.1 Dirichlet on the Convergence of Fourier Series 


Peter Gustav Lejeune Dirichlet, who had got to know Fourier personally during a 
long stay in Paris, took up the subject of the convergence of Fourier series in the 
late 1820s, by which time it was a topic of some discussion, although Dirichlet said 
that he knew of no other attempt on the problem than Cauchy’s, which he found to 
be flawed.⁸ In his opinion it was remarkable that a Fourier series expansion of an
arbitrary function converges (which suggests that he doubted neither the existence
nor the convergence of the series), and so he proposed to establish the convergence
of a Fourier series directly, and to show that the series and the function agree. He
succeeded, by a very fine application of Cauchy’s own $\varepsilon$-$\delta$ analysis and the theory
of convergence, in showing that the Fourier series representation of a function that is
piecewise continuous and piecewise monotonic on an interval converges and agrees
with the function except at the points where the function jumps. At points $x_0$ where
\[ \lim_{x \to x_0^-} f(x) = a \neq b = \lim_{x \to x_0^+} f(x) \]
the Fourier series takes the value $\frac{1}{2}(a + b)$.
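
Dirichlet's statement about jumps can be illustrated numerically. The sketch below is an example chosen here, not one from Dirichlet or the book: it takes $f(x) = 1$ for $|x| < 1$ and $f(x) = 0$ for $1 < |x| \le \pi$, whose cosine coefficients from (10.12) are $a_0 = 2/\pi$ and $a_n = 2\sin n/(n\pi)$, and evaluates a partial sum at the jump $x = 1$, where the limiting value should be $\frac{1}{2}(1 + 0)$.

```python
import math

def box_partial_sum(x, terms):
    """Partial Fourier sum on [-pi, pi] of f = 1 for |x| < 1 and f = 0 elsewhere."""
    total = 1.0 / math.pi                  # the constant term a_0 / 2
    for n in range(1, terms + 1):
        a_n = 2.0 * math.sin(n) / (n * math.pi)
        total += a_n * math.cos(n * x)
    return total
```

With a few thousand terms the value at $x = 1$ settles near one half, while at $x = 0.5$ and $x = 2$ it approaches 1 and 0 respectively. Convergence is slow (roughly like the reciprocal of the number of terms), which is typical for a function with jumps.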

Dirichlet’s rigorous argument made it clear that to prove Fourier’s claim in any 
greater degree of generality would be hard work, and indeed later generations of 
mathematicians would discover that Fourier’s claim is in fact false in general and 
applies only to functions that do not oscillate wildly. 


10.2.2 Fourier Integrals


Fourier also considered how heat diffused in an infinite bar under two distinct con- 
ditions. In the first, a part of the bar is raised to a temperature given by a function 
F(x), the rest being at temperature zero. In the second, one end of the bar is kept at 
a constant temperature. 

He began (§345) by supposing that the bar is an infinite line, modelled by the positive
real axis $\{x : 0 < x\}$, and the heated region is the interval $[0, 1]$, the temperature
being given by a function $F(x)$ on $[-1, 1]$. He then supposed that the whole line
is considered, and the data extended to the negative real axis by defining $F(-x) = F(x)$.
The problem is to be solved by a function $v(x, t)$.


⁷ See Poisson ([227], 186) and Sturm ([250], 400).
⁸ See Dirichlet ([63] and, for a historical account in keeping with the present book, [127]).

The equation to be solved is
\[ \frac{\partial v}{\partial t} = k\frac{\partial^2 v}{\partial x^2} - hv, \]
where $h$ and $k$ are constants determined by the physical properties of the wire. He
set $v = e^{-ht}u$ and reduced the equation to
\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2}. \]
The equation is solved, for example, by
\[ u = a\cos(qx)\,e^{-kq^2 t}, \]
where $a$ and $q$ are arbitrary constants and therefore, Fourier supposed, by an infinite
sum of such expressions:
\[ u = \sum_j a_j \cos(q_j x)\,e^{-k q_j^2 t}. \]
j 


By supposing the successive $q$s vary only a little, this led him to look for a solution
of the form
\[ u(x,t) = \int_0^{\infty} f(q)\cos(qx)\,e^{-kq^2 t}\,dq, \]
where $f(q)$ is an as-yet unknown function of $q$ that can be found from the initial
data, because
\[ u(x, 0) = F(x). \]
This led him to this equation for $f(q)$:
\[ F(x) = \int_0^{\infty} f(q)\cos(qx)\,dq, \]
which he called (§346) “a remarkable problem whose solution demands attentive
examination”.⁹

He solved this equation by going back to thinking of the solution as an infinite
series and using his old method for integrating in a way that picked up a single term
of the series. This led him to the solution
\[ f(q) = \frac{2}{\pi}\int_0^{\infty} F(x)\cos qx \, dx. \]

⁹ We recognise this as the introduction of “Fourier transforms”, whose properties Fourier did indeed
begin to study.




Therefore, the solution to the heat diffusion problem in this case is
\[ u(x,t) = \frac{2}{\pi}\int_0^{\infty}\Big( \int_0^{\infty} F(x)\cos qx \, dx \Big)\cos(qx)\,e^{-kq^2 t}\,dq. \]

Fourier then (§348) considered the special case when $F(x) = 1$ when $-1 < x < 1$
and $F(x) = 0$ otherwise. The $x$-integral becomes
\[ \frac{2}{\pi}\int_0^{1} \cos qx \, dx = \frac{2}{\pi}\,\frac{\sin q}{q}. \]


Here, Fourier remarked that the discontinuous function F(x) has been expressed by 
a definite integral. 
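
Fourier's remark can be tested numerically by inserting $f(q) = \frac{2}{\pi}\frac{\sin q}{q}$ into $F(x) = \int_0^{\infty} f(q)\cos(qx)\,dq$. The following sketch is an illustration added here: the function name, the cut-off `q_max` that replaces the infinite upper limit, and the midpoint rule are all assumptions of the sketch, so the values it returns are only approximate.

```python
import math

def box_from_integral(x, q_max=400.0, steps=200000):
    """Midpoint-rule value of (2/pi) * integral from 0 to q_max of (sin q / q) cos(qx) dq."""
    h = q_max / steps
    total = 0.0
    for i in range(steps):
        q = (i + 0.5) * h                  # midpoints avoid dividing by q = 0
        total += math.sin(q) / q * math.cos(q * x) * h
    return 2.0 / math.pi * total
```

Evaluating `box_from_integral` at points inside $(-1, 1)$ gives values near 1, and at points with $|x| > 1$ values near 0; at the jump $x = 1$ it gives a value near one half. The truncation at `q_max` leaves an oscillating error of order `1 / q_max`.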

To deal with a semi-infinite bar heated at one end, Fourier found it convenient
(§351) to think of an infinite bar heated in the middle by a function $F(x)$ for which
$F(-x) = -F(x)$. In this case, he found the solution to be
\[ u(x,t) = \frac{2}{\pi}\int_0^{\infty}\Big( \int_0^{\infty} F(\alpha)\sin(q\alpha)\,d\alpha \Big) e^{-kq^2 t}\sin(qx)\,dq. \]


At the risk of being somewhat arbitrary, I add this remark by Fourier (§358):

We might deduce also from the transformation of series into integrals the properties of the
two expressions
\[ \frac{2}{\pi}\int_0^{\infty}\frac{\cos qx \, dq}{1+q^2} \quad\text{and}\quad \frac{2}{\pi}\int_0^{\infty}\frac{q\sin qx \, dq}{1+q^2}; \]
the first (Art. 350) is equivalent to $e^{-x}$ when $x$ is positive, and to $e^{x}$ when $x$ is negative. The
second is equivalent to $e^{-x}$ when $x$ is positive, and to $-e^{x}$ when $x$ is negative, so the two
integrals have the same value, when $x$ is positive, and have values of contrary sign when $x$
is negative.


10.3 The Analysis of Fourier Integrals


Fourier was as over-confident here as he had been when dealing with infinite series— 
or perhaps we should say that rising standards of rigour were to catch up with him. 
Rather than trace the history of analyses of this aspect of his work, I shall leap to a 
satisfactory solution from just over a century later by the American mathematician 
A.G. Webster that indicates what had to be done.¹⁰

We begin with the Fourier series for a function defined on the interval $[-l, l]$:
\[ f(x) = \frac{1}{2}a_0 + \sum_{j=1}^{\infty}\Big( a_j \cos\frac{j\pi x}{l} + b_j \sin\frac{j\pi x}{l} \Big), \]
where
\[ a_j = \frac{1}{l}\int_{-l}^{l} f(s)\cos\frac{j\pi s}{l}\,ds, \quad b_j = \frac{1}{l}\int_{-l}^{l} f(s)\sin\frac{j\pi s}{l}\,ds, \]
and we suppose that the function $f$ is such that these last two integrals converge.
The question is: what happens as $l \to \infty$?

Note first that if $f(s)$ is absolutely integrable on the whole real line then $a_j$ and
$b_j$ both tend to zero as $l \to \infty$.

¹⁰ See Webster ([265], 153-156).

We define
\[ s(l, m, x) = \frac{1}{l}\int_{-l}^{l} f(s)\Big( \frac{1}{2} + \sum_{j=1}^{m}\cos\frac{j\pi}{l}(s - x) \Big) ds \]
and consider its value as both $l$ and $m$ tend to infinity. This value depends on the
order in which we take these limits.

Now, a detailed argument (here omitted, see Webster [265], 154) leads to the
conclusion that if the function $f$ is continuous at $x$ then
\[ f(x) = \lim_{p \to \infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\Big( \int_0^{p}\cos(t(s - x))\,dt \Big) f(s)\,ds, \]
where $p = m\pi/l$. So we require that $m/l \to \infty$, and so we let $m \to \infty$ first.
It then seems natural to suppose that
\[ \lim_{p \to \infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\Big( \int_0^{p}\cos(t(s - x))\,dt \Big) f(s)\,ds = \frac{1}{\pi}\int_{-\infty}^{\infty}\Big( \int_0^{\infty}\cos(t(s - x))\,dt \Big) f(s)\,ds. \]

Indeed, this is the form in which Fourier gave it. But this, to quote Webster (p. 155)
“makes no sense” because the integral
\[ \int_0^{p}\cos(t(s - x))\,dt = \frac{\sin(p(s - x))}{s - x} \]
does not tend to a limit as $p \to \infty$ but instead oscillates.

Instead, as Webster pointed out (see pp. 155-156), it is possible to switch the
order of integration (I omit the argument) and to deduce that the correct value of
the Fourier integral is
\[ \frac{1}{\pi}\int_0^{\infty}\Big( \int_{-\infty}^{\infty} f(s)\cos(t(s - x))\,ds \Big) dt. \tag{10.13} \]


Webster at this point quoted Kronecker (Vorlesungen über die Theorie der
einfachen und vielfachen Integrale, 81) to indicate the continuing importance of
Fourier’s result:

This so-called Fourier double-integral made at its discovery a tremendous impression on the
mathematical world. It was shown for the first time how an almost arbitrary function, satisfying
only the limitations mentioned, fits itself into mathematical forms. The formula (10.13)
maintains its correctness, as was shown by P. du Bois-Reymond, for various fluctuating
functions, inserted instead of the cosine.


10.4 Stokes and Laplace Transform 


In the 1860s, as we shall see more fully in Chap. 20, Thomson and Stokes used the 
heat equation to successfully describe the transmission of electricity down a wire. In 
the course of that work, Stokes used a Fourier integral approach which we shall now 
examine.¹¹

Stokes treated the problem as being one of heat diffusion down a semi-infinite 
wire (x > 0) with an arbitrary initial distribution of heat (or electricity) along with it. 
The case of heat (or electricity) concentrated at the end point x = 0 is of particular 
interest because it corresponds to an impulse at that end. 

He wrote the equation for the temperature $v(x, t)$ as
\[ \frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2}, \]
with the conditions that
\[ v(x, 0) = 0, \quad v(0, t) = f(t). \]

He then looked for a solution in the form
\[ v(x, t) = \int_0^{\infty} u(\alpha, t)\sin\alpha x \, d\alpha. \tag{10.14} \]
First, differentiation with respect to $t$ under the integral sign gives
\[ \frac{\partial v}{\partial t} = \int_0^{\infty}\frac{\partial u(\alpha, t)}{\partial t}\sin\alpha x \, d\alpha. \]

But, he observed, differentiation under the integral sign with respect to $x$ does not
work, because $v$ does not vanish when $x = 0$, and it is necessary to add the term
\[ \frac{2}{\pi}\alpha\, v(0, t) = \frac{2}{\pi}\alpha f(t), \]
as he had explained in a paper he had published earlier.

¹¹ It is well worth consulting https://www.math.ubc.ca/~feldman/m267/pdeft.pdf for an account
of how the Fourier transform can be applied to partial differential equations, including the wave
equation and the telegraphist’s equation. The same method applied to the heat equation is described
in http://web.math.ucsb.edu/~helena/teaching/math124b/heat.pdf . See also Appendix D.


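It follows that
\[ \frac{\partial^2 v}{\partial x^2} = \int_0^{\infty}\Big( \frac{2}{\pi}\alpha f(t) - \alpha^2 u \Big)\sin\alpha x \, d\alpha. \]
Hence,
\[ 0 = \frac{\partial v}{\partial t} - \frac{\partial^2 v}{\partial x^2} = \int_0^{\infty}\Big( \frac{\partial u}{\partial t} - \frac{2}{\pi}\alpha f(t) + \alpha^2 u \Big)\sin\alpha x \, d\alpha, \]
and so
\[ \frac{\partial u}{\partial t} = \frac{2}{\pi}\alpha f(t) - \alpha^2 u. \]

This is an ordinary differential equation whose solution was known. It is
\[ u(\alpha, t) = \frac{2}{\pi}e^{-\alpha^2 t}\int^{t}\alpha f(t')\,e^{\alpha^2 t'}\,dt', \]
where the constant of integration has been expressed as an arbitrary lower end point
of the integral.
The initial condition $v(x, 0) = 0$ implies that
\[ u(\alpha, t) = \frac{2}{\pi}\alpha\int_0^{t} f(t')\,e^{-\alpha^2 (t - t')}\,dt'. \]

So Stokes wrote that the temperature $v$ is given by
\[ v = \frac{2}{\pi}\int_0^{\infty}\int_0^{t} f(t')\,\alpha\, e^{-\alpha^2 (t - t')}\sin\alpha x \, d\alpha\, dt'. \]
Equation 10.14 implies that this means
\[ v = \frac{2}{\pi}\int_0^{\infty}\Big( \int_0^{t}\alpha f(t')\,e^{-\alpha^2 (t - t')}\,dt' \Big)\sin\alpha x \, d\alpha. \]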


He now switched the order of integration, so that the first integral to be evaluated
is the $\alpha$ integral, which is of the form
\[ \int_0^{\infty} e^{-a\alpha^2}\,\alpha\sin b\alpha \, d\alpha. \]

This is a derivative of a known integral:
\[ \int_0^{\infty} e^{-a\alpha^2}\cos b\alpha \, d\alpha = \frac{1}{2}\Big( \frac{\pi}{a} \Big)^{1/2} e^{-b^2/4a}, \]
so
\[ \int_0^{\infty} e^{-a\alpha^2}\,\alpha\sin b\alpha \, d\alpha = -\frac{d}{db}\,\frac{1}{2}\Big( \frac{\pi}{a} \Big)^{1/2} e^{-b^2/4a} = \frac{b\,\pi^{1/2}}{4a^{3/2}}\,e^{-b^2/4a}. \]

Whence, as Stokes said, writing $t - t'$ for $a$ and $x$ for $b$,
\[ v = \frac{x}{2\pi^{1/2}}\int_0^{t} (t - t')^{-3/2}\,e^{-x^2/(4(t - t'))} f(t')\,dt'. \]
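
Stokes's final formula can be checked in the simplest case. The sketch below is an illustration added here, not part of Stokes's argument: it assumes the constant boundary value $f(t) = 1$, for which the substitution $s = x/(2\sqrt{t - t'})$ turns the integral into the complementary error function, $v = \operatorname{erfc}\big(x/(2\sqrt{t})\big)$. The function name `stokes_v` and the midpoint rule are choices made for the sketch.

```python
import math

def stokes_v(x, t, f, steps=200000):
    """Midpoint-rule value of x/(2 sqrt(pi)) * int_0^t (t-s)^(-3/2) exp(-x^2/(4(t-s))) f(s) ds."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h                  # midpoint in the variable t'
        tau = t - s                        # tau = t - t' stays strictly positive
        total += tau ** -1.5 * math.exp(-x * x / (4.0 * tau)) * f(s) * h
    return x / (2.0 * math.sqrt(math.pi)) * total

def constant_one(s):
    """Boundary value f(t) = 1 held at the end x = 0."""
    return 1.0
```

The value `stokes_v(1.0, 1.0, constant_one)` can be compared with `math.erfc(0.5)`. The exponential factor is what keeps the integrand finite as $t' \to t$, which is one way to see why the $x^2$ term matters.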


Three comments are in order. First, switching the order of integration is imper- 
missible, although the conclusion remains correct. 

Second, as we shall see in Chap. 20, the x? term is critical. 

Third, the known integral is an example of a Laplace transform, which we now 
proceed briefly to discuss. 

Ingenuity with integrals was a necessary skill of the eighteenth-century mathe- 
matician, and Euler and others discovered many clever results without being encum- 
bered by rigour. One of Laplace’s contributions in this line was the study of integrals 


of the form
\[ \int_0^{\infty} e^{-st} f(t)\,dt, \]
which are today called the Laplace transform of the function $f(t)$.

When the function $f(t)$ is a power of $t$ the Laplace transform can be found
recursively, starting with $f(t) = t$. When $f(t)$ is an exponential it is elementary to
find the transform, and in this way the Laplace transforms of the trigonometric and 
hyperbolic functions are found. More complicated integrals, such as the one Stokes 
used, require ingenuity and were the stock in trade of mathematicians of the day (as 
they still are for many kinds of engineer). 
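
The recursion for powers and the step from exponentials to trigonometric functions can be written out as follows. This is a standard derivation added here for illustration, with $s > 0$ assumed throughout and $\mathcal{L}$ used as shorthand for the transform; it is not a quotation from Laplace.

```latex
\begin{align*}
\mathcal{L}[t^n](s) &= \int_0^{\infty} e^{-st} t^n \, dt
  = \Big[ -\frac{t^n e^{-st}}{s} \Big]_0^{\infty} + \frac{n}{s}\int_0^{\infty} e^{-st} t^{n-1}\, dt
  = \frac{n}{s}\,\mathcal{L}[t^{n-1}](s), \\
\mathcal{L}[1](s) &= \frac{1}{s}
  \;\Longrightarrow\; \mathcal{L}[t](s) = \frac{1}{s^2}
  \;\Longrightarrow\; \mathcal{L}[t^n](s) = \frac{n!}{s^{n+1}}, \\
\mathcal{L}[e^{i\omega t}](s) &= \frac{1}{s - i\omega}
  \;\Longrightarrow\; \mathcal{L}[\cos\omega t](s) = \frac{s}{s^2 + \omega^2}, \quad
  \mathcal{L}[\sin\omega t](s) = \frac{\omega}{s^2 + \omega^2}.
\end{align*}
```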

The Laplace integral
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}, \]
which implies that
\[ \int_{-\infty}^{\infty} e^{-ax^2}\,dx = \sqrt{\pi/a}, \]
is one of the pleasures of contour integration in elementary complex function theory.
I leave that for you to find, but this anecdote about it is irresistible.¹²

When Thomson was young he went to Paris where he formed a high opinion
of Joseph Liouville. He asked him about this integral, and Liouville immediately
evaluated it. This mightily impressed Thomson, and

Once when lecturing he [Thomson] used the word “mathematician” and then interrupting
himself asked the class: “Do you know what a mathematician is?”. Stepping to the blackboard
he wrote upon it:
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \]
Then, putting his finger on what he had written, he turned to his class and said: “A mathematician
is one to whom that is as obvious as that twice two makes four is to you. Liouville
was a mathematician.” Then he resumed his lecture.

¹² From Thompson [254], 1139 quoted in Lützen [192], 146. The author here is Sylvanus P. Thompson,
in his biography of Thomson.


10.5 Exercises 


1. Find Fourier series expansions of some very simple functions, such as $f(x) = 1$
   and $f(x) = x$ on a suitable interval.
2. Confirm Liouville’s claim that
   \[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \]
3. Fourier’s first method for finding Fourier coefficients invoked Wallis’s series for
   $\pi$ as an infinite product. Find out what this is and how Fourier used it.


Questions

1. Euler said that the heat equation could only be solved with great effort. Fourier
   tackled it with methods drawn from the theory of the vibrating string. What does
   that say about methods for solving partial differential equations around 1800?
2. The eighteenth-century debate about arbitrary solutions to the wave equation and
   their representation by infinite series of sines and cosines was inconclusive, but
   the nineteenth-century debate was hugely productive. Why do you think this was?


Chapter 11
Gauss and the Hypergeometric Equation


11.1 Introduction 


The hypergeometric equation is arguably the richest example of a linear ordinary 
differential equation with polynomial functions as coefficients. It has deep roots in 
the study of elliptic integrals, and its study throughout the nineteenth century was to 
be promoted by Gauss, Riemann, and others.¹


11.2 Elliptic Integrals 


The simplest, paradigmatic, elliptic integral is
\[ u = \int_0^{x}\frac{dt}{\sqrt{1 - t^4}}. \tag{11.1} \]
It measures arc length along the lemniscate $r^2 = \cos 2\theta$, which is a curve in the shape
of a figure eight.

The Italian mathematician Count Fagnano had found some interesting results
about this integral in 1714, and after Fagnano submitted his life’s work, the Produzioni
[101], to the Berlin Academy they were sent to Euler, who greatly extended them in
the early 1750s.² Even so, it was becoming an embarrassment that the integral could


not be expanded as a function of its upper endpoint, except as a power series that
revealed no significant properties of the integral.

¹ For a much fuller account, see Gray [124] and Bottazzini and Gray [22].
² The first paper, E252, was presented to the Berlin Academy in January 1752, and the second, E251,
was presented to the St. Petersburg Academy in April 1753. Both were published for the first time
in 1761, which says something about the turbulent conditions of the time.

Euler’s work on many topics was a great stimulus to Adrien-Marie Legendre,
and he took up the challenge of including elliptic integrals in an extended theory of
functions. He concentrated on the integrals
\[ F(x) = \int_0^{x}\frac{dt}{\Delta} \quad\text{and}\quad E(x) = \int_0^{x}\Delta\,dt, \]
where $\Delta = \sqrt{(1 - t^2)(1 - c^2 t^2)}$, which he called integrals of the first and second
kinds, respectively. He called the parameter $c$ the modulus and required it to be real.
He was also interested in the corresponding complete integrals $\int_0^{1}\frac{dt}{\Delta}$ and $\int_0^{1}\Delta\,dt$,
which he denoted $F^1$ and $E^1$, respectively, and as $F^1(c)$ and $E^1(c)$ when he wanted
to think of them as functions of the modulus.

Legendre’s three-volume Exercises de calcul intégral [184] draws together his 
life’s work on the subject. Among the many results it contains was one that showed 
that the complete elliptic integrals satisfy linear differential equations as functions 


of the modulus³:
\[ (1 - c^2)\frac{d^2 F^1}{dc^2} + \frac{1 - 3c^2}{c}\frac{dF^1}{dc} - F^1 = 0 \tag{11.2} \]
\[ (1 - c^2)\frac{d^2 E^1}{dc^2} + \frac{1 - c^2}{c}\frac{dE^1}{dc} + E^1 = 0. \tag{11.3} \]

He solved these equations by the method of undetermined coefficients and so obtained
power series expansions for the complete integrals $F^1(c)$ and $E^1(c)$. He also established
a strikingly attractive result connecting complete integrals of the first two kinds
with complementary moduli ($c$ and $b = \sqrt{1 - c^2}$):
\[ \frac{\pi}{2} = F^1(c)E^1(b) + F^1(b)E^1(c) - F^1(b)F^1(c). \tag{11.4} \]


Legendre was keen to show that his new functions would be useful. He discussed at 
length how to calculate tables of values for them, and then investigated three problems
in detail: the rotation of a solid about a fixed point; the motion (either in a plane or 
in space) of a body attracted to two fixed bodies; and the gravitational attraction 
due to an homogeneous ellipsoid. In the first volume of his Traité (1828), he further 
investigated motion under central forces, the surface area of oblique cones, the surface 
area of ellipsoids, and the problem of determining geodesics on an ellipsoid. But for 
all this work, these integrals did not reveal their most fundamental properties to him, 
as we shall now see. 


³ It is sometimes said that Euler studied the first of these in 1750, supposedly in the paper E154,
which is indeed about the rectification of the ellipse. But this is not quite true; there Euler studied
the similar but different differential equation $(1 - p^2)\frac{d^2 q}{dp^2} - \frac{1 + p^2}{p}\frac{dq}{dp} + q = 0$.

Fig. 11.1 Carl Friedrich Gauss (1777-1855) by Christian Albrecht Jensen, 1840


11.3 Gauss


Gauss (Fig. 11.1) was arguably the first mathematician to leave the circumscribed 
eighteenth-century domain of functions given by explicit expressions and move with
ease into the large class of functions known only indirectly through some prescribed 
property. 

This move confronts all who take it with the question: when is a function “known”? 
One answer is to develop a theory of functions in terms of some characteristic traits 
that can be used to mark certain functions out as having particular properties: they 
are periodic, or they have no zeros, for example. A second answer sidesteps the 
question and regards the inter-relation of functions given in power series as itself 
the answer, which is what Gauss did in his study of the hypergeometric series. Most 
mathematicians adopted a mixture of the two approaches depending on their own 
success with a given problem. Later, Weierstrass and his followers in the Berlin school 
based their theory of functions on the study of series. On the other hand, Riemann 
and later workers, chiefly Klein and Poincaré, sought more geometric answers. 

Gauss was a brilliantly gifted mathematician born at an unusual time. In 1801, the 
year Gauss became famous at the age of 24, Lagrange was 64, Laplace 51, Legendre 
48, and Monge 54. Contact with them, and the younger generation of Cauchy and 
Fourier, would have been difficult for Gauss because of the Napoleonic war, and 
perhaps distasteful, given his conservative disposition. His teachers, Pfaff (then 35) 
and Kaestner (81), were not of the first rank, and nor were his contemporaries Bartels 
and Farkas Bolyai. By the time the next generation of young mathematicians emerged 
(Jacobi and Abel, for example) Gauss had become confirmed in a lifelong avoidance 
of mathematicians, and was closer to German astronomers, notably Bessel, in whose 
subject he worked increasingly.

Gauss left several of his best discoveries unpublished, and they only became known
with the publication of his collected works after his death in 1855. His sympathy for
the work of Janos Bolyai and Lobachevskii, when it was revealed, helped change attitudes
to non-Euclidean geometry, and his work on elliptic functions and his work on
the hypergeometric equation and the hypergeometric series were also revelatory. To
introduce it, we must make a digression and consider what is called the arithmetico-geometric
mean (agm).

Gauss discovered the agm for himself when he was 15. It is defined as follows
for positive numbers $a_0$ and $b_0$. Set $a_1 = \frac{1}{2}(a_0 + b_0)$, their arithmetic mean, and
$b_1 = \sqrt{a_0 b_0}$, their geometric mean. The iteration of this process, defining
\[ a_{n+1} = \frac{1}{2}(a_n + b_n) \quad\text{and}\quad b_{n+1} = \sqrt{a_n b_n}, \]
produces two sequences $(a_n)$ and $(b_n)$ that converge to the same limit, $\alpha$, called the agm
of $a_0$ and $b_0$. Convergence follows from the inequality $a_{n+1} - b_{n+1} < \frac{1}{2}(a_n - b_n)$.
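
The iteration is easy to try out. The following is a minimal sketch added here for illustration: the function name `agm`, the relative tolerance, and the decision to return the whole history of pairs are choices of the sketch, not anything taken from Gauss.

```python
import math

def agm(a, b, tol=1e-15):
    """Arithmetico-geometric mean of two positive numbers, with the iteration history."""
    history = [(a, b)]
    # Stop once the two sequences agree to within a relative tolerance.
    while abs(a - b) > tol * max(a, b):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        history.append((a, b))
    return a, history
```

Calling `agm(1.0, math.sqrt(2.0))` reaches agreement to machine precision in a handful of steps, far faster than the halving guaranteed by the inequality above.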

Gauss denoted the agm of $a$ and $b$ by $M(a, b)$. Plainly $M(\lambda a, \lambda b) = \lambda M(a, b)$
and Gauss considered various functions of the form $M(1, x)$. For example,
\[ M(1, 1 + x) = M\Big( 1 + \frac{x}{2}, \sqrt{1 + x} \Big), \]
so setting $x = 2t + t^2$ he obtained power series expansions with undetermined coefficients
for $M$ in terms of $x$ and then in terms of $t$, from which the coefficients
could be calculated. They display no particular pattern, but various manipulations
led Gauss to this dramatic series for a reciprocal of $M$:

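\[ y = M(1 + x, 1 - x)^{-1} = 1 + \frac{1}{4}x^2 + \frac{9}{64}x^4 + \frac{25}{256}x^6 + \cdots
   = 1 + \Big( \frac{1}{2} \Big)^2 x^2 + \Big( \frac{1\cdot 3}{2\cdot 4} \Big)^2 x^4 + \Big( \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6} \Big)^2 x^6 + \cdots. \]

As a function of $x$, $y$ satisfies the differential equation
\[ (x^3 - x)\frac{d^2 y}{dx^2} + (3x^2 - 1)\frac{dy}{dx} + xy = 0, \]
which is Legendre’s equation (11.2). Gauss also found another, linearly independent,
solution $M(1, x)^{-1}$.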
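
Gauss's series can be compared with a direct computation of the agm. The sketch below is an illustration added here; the function names are hypothetical, the number of agm steps is simply taken to be more than enough, and the series is truncated, so agreement is only to rounding error for $|x| < 1$.

```python
import math

def agm(a, b, steps=60):
    """Arithmetico-geometric mean by a fixed number of steps (convergence is quadratic)."""
    for _ in range(steps):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def reciprocal_series(x, terms=200):
    """1 + (1/2)^2 x^2 + (1*3/(2*4))^2 x^4 + ..., truncated after `terms` terms."""
    total = 1.0
    ratio = 1.0                            # running value of 1*3*...*(2n-1) / (2*4*...*2n)
    for n in range(1, terms):
        ratio *= (2 * n - 1) / (2 * n)
        total += ratio * ratio * x ** (2 * n)
    return total
```

For instance, `reciprocal_series(0.5)` and `1 / agm(1.5, 0.5)` should agree to many decimal places.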

The substitution $x^2 = z$ turns the equation for $M(1 + x, 1 - x)^{-1}$ into this example
of the hypergeometric equation
\[ z(1 - z)\frac{d^2 y}{dz^2} + (1 - 2z)\frac{dy}{dz} - \frac{1}{4}y = 0, \]
as we shall now explain. It is another form of Legendre’s equation.


11.3.1 The Hypergeometric Equation


Gauss published only his study of the hypergeometric series, [114] in 1812. The 
second part, on the hypergeometric equation [115], which is the differential equation 
satisfied by the hypergeometric series, was found among the extensive Nachlass and 
follows on from the first in numbered paragraphs (Sects. 38-57). 

The published paper is not remarkable by Gauss’s standards, although it con- 
siders x as a complex variable, and contains the earliest rigorous argument for the 
convergence of a power series and a study of the behaviour of the function at a point 
on the boundary of the circle of convergence, as well as a thorough examination 
of continued fraction expansions for certain quotients of hypergeometric functions. 
Part two is given over to finding several solutions of the hypergeometric equation 
and the relationships between them, and is of more interest to us here. 

In the first part, Gauss observed that the series
\[ F(\alpha, \beta, \gamma, x) = 1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha + 1)\beta(\beta + 1)}{1\cdot 2\cdot\gamma(\gamma + 1)}x^2 + \cdots, \]
where $\alpha$, $\beta$, and $\gamma$ are real numbers, is a polynomial if either $\alpha - 1$ or $\beta - 1$ is a
negative integer, and is not defined at all if $\gamma$ is a negative integer or zero (this case
he excluded). In all other cases, the ratio test shows that the series is convergent for
$x = a + bi$ whenever $a^2 + b^2 < 1$. It is striking that Gauss was willing to introduce
a new function as a complex-valued function of a complex variable.
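
The series is straightforward to sum term by term, since each term is obtained from the previous one by a simple ratio. The sketch below is an illustration added here: the name `hyp_series` and the fixed number of terms are assumptions of the sketch, and it is only meaningful inside the unit circle (or when the series terminates).

```python
import math

def hyp_series(alpha, beta, gamma, x, terms=500):
    """Partial sum of Gauss's series F(alpha, beta, gamma, x); x may be real or complex."""
    total = 1.0
    term = 1.0
    for n in range(terms):
        # Ratio of successive terms: (alpha+n)(beta+n) x / ((1+n)(gamma+n)).
        term *= (alpha + n) * (beta + n) / ((1 + n) * (gamma + n)) * x
        total += term
    return total
```

Three checks against closed forms are natural: $F(1, 1, 2, x) = -\log(1 - x)/x$, $F(1, 1, 1, x) = 1/(1 - x)$ for complex $x$ with $|x| < 1$, and the polynomial case $F(-2, 1, 1, x) = (1 - x)^2$.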

He gave, following Pfaff [209], a list of functions which can be represented by
means of hypergeometric functions. For example,
\[ e^t = \lim_{k \to \infty} F\Big( 1, k, 1, \frac{t}{k} \Big), \]
and the trigonometric functions can now be obtained. Gauss then introduced the idea of
contiguous functions (Sect. 1, Sect. 7): $F(\alpha, \beta, \gamma, x)$ is contiguous to any of the six
functions $F(\alpha \pm 1, \beta \pm 1, \gamma \pm 1, x)$ obtained from it by increasing or decreasing
one coefficient by 1. He obtained 15 equations connecting $F(\alpha, \beta, \gamma, x)$ with each
of the 15 pairs of its different contiguous functions by systematically permuting the
$\alpha$s, $\beta$s, $\gamma$s, etc. and comparing coefficients. As an illustration, here is the first of these
equations:
\[ (\gamma - 2\alpha - (\beta - \alpha)x)F(\alpha, \beta, \gamma, x) + \alpha(1 - x)F(\alpha + 1, \beta, \gamma, x) - (\gamma - \alpha)F(\alpha - 1, \beta, \gamma, x) = 0. \]
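
This first contiguous relation can be tested numerically by summing the three series. The sketch below is an illustration added here, with hypothetical function names and arbitrarily chosen sample parameters; it checks the relation at individual points and is no substitute for Gauss's comparison of coefficients.

```python
def hyp_series(alpha, beta, gamma, x, terms=500):
    """Partial sum of Gauss's series F(alpha, beta, gamma, x) for |x| < 1."""
    total = 1.0
    term = 1.0
    for n in range(terms):
        term *= (alpha + n) * (beta + n) / ((1 + n) * (gamma + n)) * x
        total += term
    return total

def contiguous_residual(alpha, beta, gamma, x):
    """Left-hand side of the first contiguous relation; it should vanish."""
    return ((gamma - 2 * alpha - (beta - alpha) * x) * hyp_series(alpha, beta, gamma, x)
            + alpha * (1 - x) * hyp_series(alpha + 1, beta, gamma, x)
            - (gamma - alpha) * hyp_series(alpha - 1, beta, gamma, x))
```

Evaluating `contiguous_residual` at, say, `(0.3, 0.7, 1.9, 0.4)` returns a number at the level of rounding error.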


As Felix Klein was to remark ([157], 16) these establish that any three contiguous 
functions satisfy a linear relationship with rational functions for coefficients. As a 
result, there are linear relationships over the rational functions between any three 
functions of the form F(a +m, B +n, y + p, x), where m, n, and p are integers. 
Gauss then showed how to use contiguous functions to provide continued fraction
expansions of quotients of hypergeometric functions, e.g.
\[ \frac{F(\alpha, \beta + 1, \gamma + 1, x)}{F(\alpha, \beta, \gamma, x)}, \]
and hence for several familiar elementary functions. Observe, as Gauss did at the
start of the second paper, that $F(\alpha, \beta, \gamma, x)$ and $\frac{dF(\alpha, \beta, \gamma, x)}{dx}$ are contiguous in
the obvious generalised sense, the relationship between them being, essentially, the
differential equation itself. This is because
\[ \frac{d}{dx}F(\alpha, \beta, \gamma, x) = \frac{\alpha\beta}{\gamma}F(\alpha + 1, \beta + 1, \gamma + 1, x). \]


In the third and final section of the published paper, Gauss considered the question
of the value of $F(\alpha, \beta, \gamma, 1)$, i.e. of $\lim_{x \to 1} F(\alpha, \beta, \gamma, x)$, for real $\alpha$, $\beta$, $\gamma$. He then
defined
\[ \Pi(k, z) = \frac{1\cdot 2\cdots k \cdot k^z}{(z + 1)(z + 2)\cdots(z + k)}, \]
where $k$ is a positive integer, and $\Pi(z) = \lim_{k \to \infty}\Pi(k, z)$, which may be called
(Gauss’s) factorial function and is his version of the Gamma function. The limit
certainly exists for $\mathrm{Re}(z) > 0$ and $\Pi$ satisfies the functional equation $\Pi(z + 1) =
(z + 1)\Pi(z)$ with $\Pi(0) = 1$, from which it follows that $\Pi(n) = n!$ for positive integral
$n$. $\Pi$ is infinite at all negative integers.⁴ The factorial function enabled Gauss to
obtain many results that earlier mathematicians had obtained only with great effort.
As Gauss puts it: “Whence many relations, which the illustrious Euler could only
get with difficulty, fall out at once”.
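
The limit defining $\Pi(z)$ can be watched numerically. The sketch below is an illustration added here: the name `gauss_pi` is hypothetical, the product is accumulated through logarithms to avoid overflow, and it is written for real $z > -1$ only. It relies on the identity $\Pi(z) = \Gamma(z + 1)$ to compare with Python's `math.gamma`.

```python
import math

def gauss_pi(k, z):
    """Gauss's Pi(k, z) = k! k^z / ((z+1)(z+2)...(z+k)), computed via logarithms."""
    log_value = z * math.log(k)
    for j in range(1, k + 1):
        log_value += math.log(j) - math.log(z + j)
    return math.exp(log_value)
```

With `k = 100000` the value of `gauss_pi(k, 0.5)` is already close to `math.gamma(1.5)`; the approach to the limit is slow, with an error that shrinks roughly in proportion to `1 / k`.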

Gauss began the second and unpublished part of the paper, “Determinatio series
nostrae per Aequationem Differentialem Secundi Ordinis”, by observing that $P =
F(\alpha, \beta, \gamma, x)$ is a solution of the hypergeometric equation:
\[ (x - x^2)\frac{d^2 w}{dx^2} + (\gamma - (\alpha + \beta + 1)x)\frac{dw}{dx} - \alpha\beta w = 0. \]
To find a second linearly independent solution he set $1 - y = x$, when the equation
becomes the first equation with $\gamma$ replaced by $\alpha + \beta + 1 - \gamma$. It, therefore, has a
solution $F(\alpha, \beta, \alpha + \beta + 1 - \gamma, 1 - x)$, and the differential equation, in general,
has solutions of the form
\[ M F(\alpha, \beta, \gamma, x) + N F(\alpha, \beta, \alpha + \beta + 1 - \gamma, 1 - x), \tag{11.5} \]
where $M$ and $N$ are constants.


Other solutions may arise which do not at first appear to be of this type, but,
he remarked, any three solutions must satisfy a linear relationship with constant
coefficients. This fact was of most use to him when transforming the differential
equation by means of a change of variable.

⁴ In the usual notation from Legendre ([184], Vol. II), $\Pi(z) = \Gamma(z + 1)$.

The substitutions he considered are of two types: the following transformations
of $x$:
\[ x = 1 - y, \quad x = \frac{1}{y}, \quad x = \frac{y}{y - 1}, \]
and these transformations of $P$:
\[ P = x^{\mu}P', \quad P = (1 - x)^{\mu}P' \]
for particular values of $\mu$. These gave him several solutions to the original equation
in terms of functions like $F(-, -, -, x)$ and $F(-, -, -, 1 - x)$ etc. (where the
blanks stand for expressions in $\alpha$, $\beta$, and $\gamma$), possibly multiplied by powers of $x$ and
$1 - x$, and also some linear identities between triples of such solutions.

The paper concludes with a discussion of certain special cases that can arise when
$\alpha$, $\beta$, and $\gamma$ are not independent, for example, when $\beta = \alpha + 1 - \gamma$, and the quadratic
change of variable $x = 4y - 4y^2$ can be made.

Gauss made a very interesting observation at this point. The equation has as one
solution in this case:
\[ F\big(\alpha, \beta, \alpha + \beta + \tfrac{1}{2}, 4y - 4y^2\big) = F\big(2\alpha, 2\beta, \alpha + \beta + \tfrac{1}{2}, y\big). \]
If, he said, $y$ is replaced by $1 - y$ this produces
\[ F\big(\alpha, \beta, \alpha + \beta + \tfrac{1}{2}, 4y - 4y^2\big) = F\big(2\alpha, 2\beta, \alpha + \beta + \tfrac{1}{2}, 1 - y\big), \]
as we can see by looking at the basis exhibited in Eq. (11.5) above, and we are led
to the seeming paradox
\[ F\big(2\alpha, 2\beta, \alpha + \beta + \tfrac{1}{2}, y\big) = F\big(2\alpha, 2\beta, \alpha + \beta + \tfrac{1}{2}, 1 - y\big), \]
“which equation is certainly false” (Sect. 55).

To resolve the paradox he distinguished between the sign F when it stood for a 
function that satisfies the hypergeometric equation, and when the sign F stood for 
the sum of an infinite series. The sum is only defined within its circle of convergence, 
but the function is to be understood for all values of its fourth term that have been 
obtained by continuous change, whether real or imaginary, provided the values 0 and 
1 are avoided. However, this “function” may be many-valued, and it is in this case. 

This being so, he argued that one would no more be misled than one would infer
from $\arcsin \tfrac{1}{2} = 30°$ and $\arcsin \tfrac{1}{2} = 150°$ that $30° = 150°$, the reason being that a
(many-valued) function such as arcsin may have different values even though its
variable has taken the same value, whereas a series may not.

Gauss here confronted the question of analytically continuing a function outside
its circle of convergence. It was his view that the solutions of the differential equation
exist everywhere but at 0, 1 (and $\infty$, although he avoided the expression), whereas
their representation in power series is a local question. However, if the function is a
many-valued function then the series expression may not be recaptured if the variable
is taken continuously along some path and restored to its original value, and neglect
of this fact can lead to absurd expressions like the one Gauss produced.

Because he here talked of continuous change in the variable in the complex number 
plane, one may thus infer that Gauss here was truly discussing analytic continuation, 
and not merely the plurality of series solutions at a given point. 

In these papers, Gauss introduced a large class of functions of a complex vari- 
able that were defined by the hypergeometric equation and were capable of various 
expressions in series. The main direction of his research was in studying relation- 
ships between the series, which in turn provided information about the nature of the 
functions under consideration. 

The second part of Gauss’s paper on the hypergeometric series raises two main 
types of question. First, it would be useful to have a systematic account of the 
solutions obtained by the various substitutions, and of the nature of the substitutions 
themselves. Second, it would be instructive to connect the hypergeometric functions 
with the newer functions in analysis, especially in complex analysis, such as the 
elliptic functions. It is striking that Kummer in his paper [167] set himself both 
these tasks and resolved them while, moreover, observing Gauss’s restrictions where 
the work would otherwise be too difficult (for example, by considering only real 
coefficients). 


11.4 Kummer and His 24 Solutions 


Ernst Eduard Kummer had studied Mathematics at the University of Halle, and in 
1836, when he published his paper on the hypergeometric equation, he was 26 and 
a Lecturer at the Liegnitz Gymnasium. Although he never attended a lecture by 
Dirichlet, he considered him to have been his real teacher, which is an indication of 
Dirichlet’s great influence on mathematics in Germany, an influence that was then 
extended to Kummer’s best student at Liegnitz, Leopold Kronecker. Kummer, with 
Kronecker and Weierstrass, went on to dominate the Berlin school of mathematics 
from 1856 until Kummer retired in 1883. His students regarded him as a gifted 
teacher and organiser of seminars, and he was diligent in his concern for them. He 
was also a man of great charm, and he had a great appetite for administration, being 
Dean of the University of Berlin twice, Rector once, and Perpetual Secretary of the 
physics-mathematics section of the Berlin Academy from 1863 to 1878. 

At the start of his long paper (1836), Kummer remarked of Gauss’s paper that

But this work is only the first part of a greater work as yet unpublished, and wants comparison
of hypergeometric series in which the last element x is different. This will therefore be the
principal purpose of the present work; the numerical application of the discovered formulae
will preferably be made to elliptic transcendents, to which in great part the general series
corresponds.


The hypergeometric equation has three singular points, $x = 0$, $x = 1$, $x = \infty$,
and in a neighbourhood of these points one expects the solution to be of the form of
a hypergeometric series in either $x$, $1 - x$, or $1/x$, respectively, possibly multiplied
by some power of $x$, $1 - x$, or $1/x$. This was one reason for Kummer to look for
appropriate changes of variables in the study of the hypergeometric equation. Another 
was the need to find hypergeometric series that yield independent solutions of the 
hypergeometric equation other than the canonical hypergeometric series, because 
the differential equation has a basis of two independent solutions. 

Kummer investigated what happens to the hypergeometric equation under changes
of variable, and found that the only changes of variable that can be made (unless
there are special relations between the coefficients $\alpha$, $\beta$, $\gamma$) are the ones that Gauss
considered, including the identity transformation $x = z$ and the one Gauss had not
written down, $x = 1/(1 - z)$.

The details of his argument allowed him to deduce more. For example, if $\alpha$ is
replaced by $\gamma - \alpha$ and $\beta$ is replaced by $\gamma - \beta$ this produces a new solution:

$$(1 - x)^{\gamma - \alpha - \beta} F(\gamma - \alpha, \gamma - \beta, \gamma, x).$$

Other similar changes produce the solutions

$$x^{1 - \gamma} F(\alpha - \gamma + 1, \beta - \gamma + 1, 2 - \gamma, x),$$

and

$$x^{1 - \gamma}(1 - x)^{\gamma - \alpha - \beta} F(1 - \alpha, 1 - \beta, 2 - \gamma, x).$$


These solutions can also be checked directly, by long but straightforward calculations. 

In this way, he was led to his finest achievement in this paper (Kummer 1836
[167], 52-53): an enumeration of a family of 24 solutions to the hypergeometric
equation that between them form what can be considered as the complete solution to
the equation (see Fig. 11.2).⁵ In Kummer’s work, the variable is real, and he regarded
the 24 solutions as the best way to represent solutions valid near the singular points
in a variety of convenient forms. Later writers, Riemann and Schwarz, allowed the
variable to be complex, in which case Kummer’s 24 solutions provide not only sets of
bases for the solutions everywhere, but their inter-relations (which he also described)
give a description of their analytic continuation on the complex sphere. However,
this interpretation was almost completely lacking in Kummer’s work (despite the
remarks of some later mathematicians and historians such as Klein ([158], 267) and
Biermann ([16], 523)).⁶


5See also his Collected Papers Vol. II, 88, 89.


6Kummer concluded the paper with an unremarkable study of what happens when $x$ is allowed to
be complex but $\alpha$, $\beta$, and $\gamma$ stay real that has no bearing on the issue of analytic continuation.

We could check these solutions by introducing a new variable, say z = x/(x − 1),
writing the hypergeometric equation in terms of z, and plugging in the new solution. 
If we do so, it is important to note that the transformed equation may no longer look 
like the hypergeometric equation. 

As you can see in the table he provided, Kummer expressed the solutions of the
hypergeometric equation in terms of a hypergeometric series possibly multiplied by
a power of $x$ or $1 - x$, where the variable in the hypergeometric series is one of

$$x, \quad 1 - x, \quad \frac{1}{x}, \quad \frac{1}{1 - x}, \quad \frac{x}{x - 1}, \quad \text{or} \quad \frac{x - 1}{x}.$$


The various solutions are valid on several different domains of continuity, which for 
Kummer were intervals on the real line separated by the points x = 0 and x = 1 and 
which Gauss would have understood as discs and half-planes in the complex plane. 

If we let $x$ be a complex variable, then it is easy to see that, because $F(\alpha, \beta, \gamma, x)$
converges for $|x| < 1$, which is the disc $D_0$ with centre 0 and radius 1, the series in the
variables

$$1 - x, \quad \frac{1}{x}, \quad \frac{1}{1 - x}, \quad \frac{x}{x - 1}, \quad \text{and} \quad \frac{x - 1}{x}$$

converge, respectively, in the domains

$|1 - x| < 1$, which is the disc $D_1$ with centre 1 and radius 1;
outside the disc $D_0$;
outside the disc $D_1$;
in the half-plane of complex numbers with real part less than $\tfrac{1}{2}$;
in the half-plane of complex numbers with real part greater than $\tfrac{1}{2}$.
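
The correspondence between the five variables and their domains of convergence can be checked numerically. The following Python sketch is only an illustration: the names CASES, domains_agree and SAMPLES, and the sample points themselves, are choices made here and are not taken from Kummer or Gauss. For each change of variable t(x) it tests whether |t(x)| < 1 holds exactly when x lies in the region stated above.

```python
# Each entry pairs a change of variable t(x) with the region of the x-plane
# in which |t(x)| < 1, i.e. where the hypergeometric series in t converges.
CASES = [
    ("1 - x",     lambda x: 1 - x,       lambda x: abs(x - 1) < 1),  # inside D_1
    ("1/x",       lambda x: 1 / x,       lambda x: abs(x) > 1),      # outside D_0
    ("1/(1 - x)", lambda x: 1 / (1 - x), lambda x: abs(x - 1) > 1),  # outside D_1
    ("x/(x - 1)", lambda x: x / (x - 1), lambda x: x.real < 0.5),    # Re x < 1/2
    ("(x - 1)/x", lambda x: (x - 1) / x, lambda x: x.real > 0.5),    # Re x > 1/2
]


def domains_agree(points):
    """Return True if |t(x)| < 1 matches the stated region at every point."""
    return all(
        (abs(t(x)) < 1) == region(x)
        for _, t, region in CASES
        for x in points
    )


# Sample points (assumed for illustration), chosen away from 0, 1 and from
# the boundary curves |x| = 1, |x - 1| = 1 and Re x = 1/2.
SAMPLES = [0.3 + 0.2j, -2 + 1j, 1.4 - 0.3j, 3 + 3j, 0.7 + 2j, -0.5 - 0.1j]
print(domains_agree(SAMPLES))
```

The two half-planes arise because |x/(x − 1)| < 1 is the condition that x is nearer to 0 than to 1.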


Because some of these domains overlap, Kummer also established the linear
relations that exist between any three of the 24 solutions that converge on a common
neighbourhood. Some are simple equalities, for example, between expressions (1),
(2), (17), and (18) (see [167], 54-55):

$$F(\alpha, \beta, \gamma, x) = (1 - x)^{\gamma - \alpha - \beta} F(\gamma - \alpha, \gamma - \beta, \gamma, x) = (1 - x)^{-\alpha} F\left(\alpha, \gamma - \beta, \gamma, \frac{x}{x - 1}\right).$$

In fact, there are six different families of four equal solutions thus:
1, 2, 17, 18; 3, 4, 19, 20; 5, 6, 21, 22; 7, 8, 23, 24; 9, 12, 13, 15; and 10, 11, 14, 16.
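
The equalities in the first family can be tested numerically by summing the series directly. The sketch below is a plain check under stated assumptions, not Kummer's method: the helper names hyp_series and kummer_1_2_17 are invented here, a, b, c stand for $\alpha$, $\beta$, $\gamma$, and the parameter values and the point x = 0.2 are arbitrary illustrative choices for which all three series converge.

```python
def hyp_series(a, b, c, x, terms=400):
    """Partial sum of F(a, b, c, x) = sum_n (a)_n (b)_n / ((c)_n n!) x^n, |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms of the hypergeometric series.
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


def kummer_1_2_17(a, b, c, x):
    """Evaluate solutions (1), (2) and (17) at a real x with |x| < 1 and x < 1/2."""
    s1 = hyp_series(a, b, c, x)
    s2 = (1 - x) ** (c - a - b) * hyp_series(c - a, c - b, c, x)
    s17 = (1 - x) ** (-a) * hyp_series(a, c - b, c, x / (x - 1))
    return s1, s2, s17


# Illustrative values (alpha, beta, gamma) = (0.3, 1.1, 2.5) at x = 0.2.
s1, s2, s17 = kummer_1_2_17(0.3, 1.1, 2.5, 0.2)
print(s1, s2, s17)
```

The three printed values should agree to within rounding error, which is what the equality of expressions (1), (2), and (17) asserts on their common domain.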


So, to find all the linear relations between the 24 solutions it is enough to consider the
six different ones 1, 3, 5, 7, 13, 14. Of these, 5 and 7 converge or diverge exactly
when 13 and 14 diverge or converge, respectively (Kummer here restricted $x$ to be
real, but the observation is valid for complex $x$). The problem is thus reduced to
finding the relations between the following triples:

1, 3, 5; 1, 3, 7; 1, 3, 13; 1, 3, 14; 1, 5, 7,

and after some work he listed the relationships that arise. They all arise from evaluating
$F(-, -, -, x)$ at $x = 0$ or $x = 1$.

What Kummer’s solutions also show, on letting x be complex, is that a solution 
of the hypergeometric equation is analytic everywhere except at the three points
0, 1, $\infty$ and that in a neighbourhood of any of those points the solution becomes
analytic on being multiplied by a suitable power of $x$ or $1 - x$. In the language of
later writers, this says that the hypergeometric equation is an ordinary differential 
equation with three regular singular points. 

We shall see in Chap. 14 how Riemann deepened that insight. 


11.5 The Method of Undetermined Coefficients 


We can solve the hypergeometric equation

$$(x - x^2)y''(x) + (c - (a + b + 1)x)y'(x) - aby = 0$$

in a neighbourhood of the origin by the method of undetermined coefficients, and in
this way realise that the solutions of the hypergeometric equation that are displayed
in Fig. 11.2 are correct for various changes of variable.

To do this, we write

$$y = x^k(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots) = x^k f(x), \quad a_0 \neq 0.$$

This gives

$$y'(x) = kx^{k-1}(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots) + x^k(a_1 + 2a_2 x + \cdots + (n + 1)a_{n+1}x^n + \cdots),$$

$$y''(x) = k(k - 1)x^{k-2}(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots) + 2kx^{k-1}(a_1 + 2a_2 x + \cdots + (n + 1)a_{n+1}x^n + \cdots) + x^k(2a_2 + 6a_3 x + \cdots + (n + 2)(n + 1)a_{n+2}x^n + \cdots).$$

So

$$(x - x^2)y''(x) + (c - (a + b + 1)x)y'(x) - aby = k(k - 1)a_0 x^{k-1} + cka_0 x^{k-1} + \text{higher terms},$$

so, because $a_0 \neq 0$,

$$k(k - 1) + ck = 0,$$

and so 
k=0 or kK=1-c. 
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Fig. 11.2. Kummer’s 24 1) 
solutions, from Kummer 2) 
({167], 52-53) 3) 

4) 


16) 


20) 
21) 


22) 
23) 
24) 


11 Gauss and the Hypergeometric Equation 


F(a, B, ¥, x), 

(1—2x)r*-A F'(y—a, y—B, 2), 

#7 Fa—y+t, B—y+1, 2—%/, x), 
a7 (1— x)? F(i—a, 1—B, 2—, x), 


F(@, B, a+8—y+1, 1—2), 
et F(a—y+1, B—y +1, a+8—y +1, 1-2), 
(1—zy"F F(y—a, y—B, y—2—B +1, 1—2), 
a'7 (L—x)/-*-8 F(i—a, 1—B, y—a—P-+1, i—~x), 
a*F(a,a—y+1, a—@+1, +), 
=*F(8, B—y+1,8—a+1, +), 
21(1—ay-- F(t—a, y—a, B—a +1, +), 
a-1(1—2y-*# F(1—B, y—B, a-B-+1, =), 
(—a2)* F(a, y—B, a—B+1, 4), 
(i—2)* F(8, y—a, B—a+1, ~~), 
2-1(1—2y- F(a—y+H1, 12, a—B+1, -*+), 
a(1—ay-FF(B—y +1, 1a, B—a-+1, ~*~), 
(i—2)"F (a, y—B, » =) 

“ay F(8, ya 1, =25)s 
aV(1—a) F(a—y 41, 14—8, 2-7, 5), 
11a F(B—y +1, 1a, 2-7, ), 
2 F(a, a—y-+1, a+fB—y +1, =), 


a F(8, B—y+1, o+8—y +1, —), 
a1(1—ay F(t & Y—d, y—a—B-+4, =), 
a!-1(1—a)*-* F(1—B, y—B, y—a—841,=—). 


In the case when $k = 0$, we find by looking at the coefficient of $x^0$ that
$-ab\,a_0 + c\,a_1 = 0$, so $a_1 = \frac{ab}{c}a_0$. Likewise, on tidying up, we find that

$$a_2 = \frac{(a + 1)(b + 1)}{2(c + 1)}a_1,$$

and by looking carefully at the coefficient of $x^n$ that

$$(-ab - n(a + b + 1) - n(n - 1))a_n + (n + 1)(c + n)a_{n+1} = 0.$$

So

$$(ab + na + nb + n^2)a_n = (n + 1)(c + n)a_{n+1},$$

and so

$$a_{n+1} = \frac{(a + n)(b + n)}{(n + 1)(c + n)}a_n.$$

So, in this case, the solution of the hypergeometric equation is the hypergeometric
series

$$y(x) = a_0\left(1 + \frac{ab}{1 \cdot c}x + \frac{a(a + 1)b(b + 1)}{1 \cdot 2 \cdot c(c + 1)}x^2 + \cdots\right).$$
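
The recurrence for the coefficients lends itself to a direct numerical check. The following Python sketch is an illustration under stated assumptions: the function names, the parameter values a = 0.5, b = 1.5, c = 2 and the test point x = 0.3 are all chosen here for convenience. It builds the coefficients from the recurrence with k = 0 and then substitutes the truncated series back into the left-hand side of the differential equation.

```python
def hypergeometric_coefficients(a, b, c, n_terms, a0=1.0):
    """Coefficients a_0, ..., a_{n_terms - 1} from
    a_{n+1} = (a + n)(b + n) / ((n + 1)(c + n)) * a_n."""
    coeffs = [a0]
    for n in range(n_terms - 1):
        coeffs.append(coeffs[-1] * (a + n) * (b + n) / ((n + 1) * (c + n)))
    return coeffs


def ode_residual(a, b, c, x, n_terms=200):
    """Evaluate (x - x^2) y'' + (c - (a + b + 1) x) y' - a b y
    for the truncated power series at the point x (with |x| < 1)."""
    coeffs = hypergeometric_coefficients(a, b, c, n_terms)
    y = sum(an * x**n for n, an in enumerate(coeffs))
    dy = sum(n * an * x**(n - 1) for n, an in enumerate(coeffs) if n >= 1)
    d2y = sum(n * (n - 1) * an * x**(n - 2)
              for n, an in enumerate(coeffs) if n >= 2)
    return (x - x * x) * d2y + (c - (a + b + 1) * x) * dy - a * b * y


coeffs = hypergeometric_coefficients(0.5, 1.5, 2.0, 3)
print(coeffs)                             # a_0, a_1 = ab/c, a_2
print(ode_residual(0.5, 1.5, 2.0, 0.3))   # should be close to zero
```

A residual at the level of rounding error is what one expects inside the disc of convergence; it does not by itself say anything about the second solution with k = 1 − c.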


It is an interesting exercise to determine the domains of convergence of these 
series, which relates to the changes of variable used by Gauss, Kummer, and later 
Riemann. The contiguous equations mentioned by Gauss can be used to find a second, 
linearly independent solution to the hypergeometric equation and convergent on the 
same domain, as Gauss indicated. 


11.6 Exercises 


1. Derive Eq. (11.2)

$$(1 - c^2)\frac{d^2 F^1}{dc^2} + \frac{1 - 3c^2}{c}\frac{dF^1}{dc} - F^1 = 0$$

and (11.3)

$$(1 - c^2)\frac{d^2 E^1}{dc^2} + \frac{1 - c^2}{c}\frac{dE^1}{dc} + E^1 = 0.$$

2. Calculate the arithmetico-geometric means of several pairs of numbers, including
this one that Gauss did, the arithmetico-geometric mean of 1 and 2.
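
For the second exercise, the iteration can be carried out by hand or mechanised. The following Python sketch is one possible way to do the arithmetic, not a prescribed method: the function name agm and the stopping tolerance are assumptions made here. It replaces the pair (a, b) repeatedly by its arithmetic and geometric means until the two agree to within the tolerance.

```python
import math


def agm(a, b, tol=1e-14):
    """Arithmetico-geometric mean of two positive numbers a and b."""
    while abs(a - b) > tol * max(a, b):
        # Replace the pair by its arithmetic and geometric means.
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2


print(agm(1.0, 2.0))
print(agm(1.0, math.sqrt(2.0)))
```

Printing the intermediate pairs shows how quickly the two sequences approach one another, which is worth observing before turning to the questions below.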


Questions 


1. Find out what Gauss found significant about the arithmetico-geometric mean of
1 and $\sqrt{2}$ (it is no. 98 in his mathematical diary).

2. Why do you think Gauss published his investigations into the hypergeometric 
series but not into the hypergeometric equation? What other subjects did Gauss 
investigate but not publish? 


Chapter 12
Existence Theorems


12.1 Introduction 


It was surely inevitable that the confident methods of the eighteenth century for solv- 
ing ordinary differential equations, which were more formal than rigorous, would be 
critically examined by Augustin Louis Cauchy (Fig. 12.1), and they were. Curiously, 
however, although he appreciated the difference between real and complex methods 
he chose to advertise his complex power series methods and seemingly forgot that 
he had been the first to give a good account of ordinary differential equations in line 
with the insights of his own $\varepsilon$-$\delta$ analysis.

Cauchy also built on the work of Monge and Lagrange to provide existence the- 
orems for partial differential equations, but in this case his use of real analysis for 
the first-order case and analytic functions for the general case reflects a fundamental 
difference in what could be established. This is discussed in Chap. 17. 


12.2 Cauchy and Ordinary Differential Equations 


Cauchy’s interest in analysis was not confined to providing improved foundations for 
the subject. He applied it in many domains of mathematics and in so doing greatly 
extended it. As one example of this, we discuss his work on the topic of showing 
that ordinary differential equations have solutions. 

He tackled this question twice in his life. The first time was as early as 1820 or 
1821, when he was lecturing at the Ecole Polytechnique, and the second some 15 
years later, when he was in self-imposed exile in Prague. As we shall see, for many 
years the second method eclipsed the first, for reasons that tell us a lot about the 
mathematical community of the time and its priorities. 

In 1820, advanced mathematics students learned how to solve differential equations
from Lacroix’s Traité du calcul différentiel et du calcul intégral. This was a
deliberate ragbag of methods. Cauchy broke with this approach, and with the views
of the Directorship of the Ecole, by tackling a question that Lacroix had ignored:
Does a first-order ordinary differential equation have a solution?

Fig. 12.1 Augustin Louis Cauchy (1789-1857), artist unknown, after J. Rollet, 1840

It might seem surprising that Lacroix had not dealt with this question himself, 
but there are several reasons why he did not. Insofar as differential equations arise 
in mathematical physics, it can seem obvious that they have solutions—the only 
problem is how to find them. Lacroix was not much interested in rigour, and it is 
always possible to check that a suggested solution is correct. Even the theoretical 
formulation of a differential equation, as a curve about which some information about 
the behaviour of the tangents is given (or a function about which some information 
about the behaviour of its derivatives is given) seems to imply that there is a function 
that has the prescribed property. 

But rigour in these matters greatly concerned Cauchy. Christian Gilain, the modern
editor of Cauchy’s work in the 1820s on the topic, commented that Cauchy’s existence
theorem is not a discovery inserted in an otherwise classical text, but ([39], xxi)


it has a very important place not only in the great number of pages devoted to the proof and 
its examples, but in its role in the whole organisation of the course. It is truly a work in the 
foundations in the general theory of differential equations ... 


The lecture notes of a course on ordinary differential equations that Cauchy gave
were lost, and were not republished in the 31 volumes of his Oeuvres. However, a
set of 13 lectures, in the original edition printed by the Ecole Polytechnique, was
found by Gilain in the Bibliothèque Nationale in Paris and published in 1981.
Unfortunately, it is apparently impossible to date them precisely; Cauchy seems to
have had the idea by 1820 or 1821, and probably gave the lectures in 1823 or 1824.

In his opening five lectures, Cauchy largely followed tradition and dealt with
explicit solution methods. Lecture 6 prepared the way for Lecture 7, in which Cauchy
observed that explicit solutions to a differential equation of the form

$$\frac{dy}{dx} = f(x, y) \tag{12.1}$$

yield a family of solutions, so one could also find the solution that took a particular
value $y_0$ at a particular point $x_0$.

Cauchy now proposed that the important problem was to establish for every differential
equation the existence of a solution that took a particular value $y_0$ at a particular
point $x_0$. The existence of a general solution would then follow by allowing $x_0$ and
$y_0$ to vary. This is not a pedantic point. Lacroix, like others before him, had regarded
a solution as a formula and the methods for finding a solution were largely formal;
that is what was meant by having a general solution. Cauchy inverted the process,
and asked if there is any solution at all that would have the required properties in the
light of his revised standards for analysis as a whole.

In Lecture 7, he sketched a method for proving the existence of solutions to the
differential equation

$$\frac{dy}{dx} = f(x, y).$$

For the first time, we get restrictions on the function $f$, without which one suspects
nothing can be proved. He showed that


Theorem 12.1 If the function $f$ is continuous and bounded by $\pm A$ as a function
of $x$ and $y$ on a neighbourhood of the point $(x_0, y_0)$ in the plane, and the partial
derivative $\frac{\partial f}{\partial y}$ is also continuous and bounded in that neighbourhood, then there is
a solution to the equation.


He argued the solution function is very close to the collection of points $(x_j, y_j)$
given by

$$y_j - y_{j-1} = (x_j - x_{j-1}) f(x_{j-1}, y_{j-1})$$

when the points $x_j$ are close together. This is what you would expect, because when
$x_j - x_{j-1}$ is small the quotient

$$\frac{y_j - y_{j-1}}{x_j - x_{j-1}}$$

should be very close to the value of $\frac{dy}{dx}$ and so to the function $f$ at the point
$(x_{j-1}, y_{j-1})$. Then the stated conditions imply, by a limiting argument, that as the
points $x_j$ get closer and closer together, the points $(x_j, y_j)$ lie on a curve that is the
graph of the integral $\int f(x, y)\,dx$, which is the solution of the differential equation
and for which $y(x_0) = y_0$, so the theorem is proved.
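
The polygon construction is easy to carry out numerically. The following Python sketch is a modern illustration under stated assumptions: the function name cauchy_polygon, the use of equally spaced points, and the test equation dy/dx = y with y(0) = 1 (whose solution at x = 1 is e) are choices made here and do not come from Cauchy's lectures.

```python
import math


def cauchy_polygon(f, x0, y0, X, n):
    """Return Y, the last ordinate of the polygon with n equal steps from x0 to X.

    Each step applies y_j - y_{j-1} = (x_j - x_{j-1}) f(x_{j-1}, y_{j-1}).
    """
    h = (X - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y


# Illustrative equation (an assumption): dy/dx = y with y(0) = 1.
for n in (10, 100, 1000, 10000):
    Y = cauchy_polygon(lambda x, y: y, 0.0, 1.0, 1.0, n)
    print(n, Y, abs(Y - math.e))
```

The printed differences shrink as the points are taken closer together, which is the behaviour the limiting argument is meant to justify.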

To be more precise, let us follow Cauchy and fix some notation. In Lecture 7,
Cauchy considered the ordinary differential equation

$$\frac{dy}{dx} = f(x, y)$$

for $x \in [x_0, X]$ on the assumptions that $f$ is continuous and bounded, $|f(x, y)| < A$,
and $f_y$ is continuous and bounded, $|f_y(x, y)| < C$. He supposed the initial value of
$y$ was $y_0$ and he set $X - x_0 = H$.

He took a sequence of points $x_0 < x_1 < \cdots < x_n = X$ and defined a sequence
of points $y_0, y_1, \ldots, y_n = Y$ by the equations

$$y_1 - y_0 = (x_1 - x_0) f(x_0, y_0),$$
$$y_j - y_{j-1} = (x_j - x_{j-1}) f(x_{j-1}, y_{j-1}),$$
$$Y - y_{n-1} = (X - x_{n-1}) f(x_{n-1}, y_{n-1}).$$

He then took it for granted (validly) that

$$Y = F(x_0, x_1, \ldots, x_j, \ldots, x_{n-1}, X, y_0)$$

is a continuous function of all its variables, and proved that as a result

$$Y = y_0 + (X - x_0) f(x_0 + \theta H, y_0 + \Theta AH) \tag{12.2}$$

for some $0 < \theta < 1$ and $0 < \Theta < 1$. (He did not use the continuity of $f_y$ here.)

The proof has three small steps. First, summing up the equations above gives
$Y - y_0$ on the left-hand side, and on the right-hand side a term that can be rewritten
as $X - x_0$ times some average value of the various $f(x_{j-1}, y_{j-1})$ that Cauchy wrote
as $\Theta A$ by what we would call the intermediate value theorem, for some $0 < \Theta < 1$.

Second, similarly, for every $j$,

$$|y_j - y_{j-1}| < (X - x_0)A = HA$$

and so each $y_j$ may be replaced by an average, $y_0 + \Theta HA$, for some $0 < \Theta < 1$.

Third, therefore, the various values of the $f(x_{j-1}, y_{j-1})$ are all of the form
$f(x_0 + \theta H, y_0 + \Theta AH)$. A further averaging argument (or, we would say, use of
the intermediate value theorem) gave the required result.

Cauchy then investigated the dependence of $Y$ on the initial value $y_0$, and now
he needed the conditions on $f_y$. He proved the theorem that if the initial point is $y_0'$,
where $y_0' - y_0 = \beta_0$, then

$$|Y(y_0') - Y(y_0)| = \Theta|\beta_0|e^{CH}$$

for some $\Theta$, $0 < \Theta < 1$. This says that the solution (if it exists) is unique.

The proof began by looking at $y_1'$ and defining $\beta_1$ by the equation $y_1' - y_1 = \beta_1$.
This gives

$$|\beta_1| - |\beta_0| = (x_1 - x_0)(f(x_0, y_0') - f(x_0, y_0)).$$

The term in $f$ can be written (again by the intermediate value theorem) as

$$f_y(x_0, y_0 + \theta_0(y_0' - y_0))(y_0' - y_0) = |\beta_0| f_y(x_0, y_0 + \theta_0(y_0' - y_0)),$$

for some $0 < \theta_0 < 1$, so

$$|\beta_1| = |\beta_0|(1 + (x_1 - x_0) f_y(x_0, y_0 + \theta_0(y_0' - y_0))).$$

By a further use of the intermediate value theorem this becomes

$$|\beta_1| = |\beta_0|(1 + (x_1 - x_0)\Theta_0 C),$$

for (yet another) $0 < \Theta_0 < 1$. A similar argument applies at every step.
To combine the equations that result, Cauchy noted that

$$1 + \Theta_0 C(x_1 - x_0) < 1 + C(x_1 - x_0) < e^{C(x_1 - x_0)}.$$

Therefore

$$|\beta_n| < |\beta_0| e^{C(x_1 - x_0)} e^{C(x_2 - x_1)} \cdots e^{C(X - x_{n-1})} = |\beta_0| e^{C(X - x_0)} = |\beta_0| e^{CH},$$

and the result follows.
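
The estimate on the dependence on the initial value can be watched at work in a small computation. The sketch below is only an illustration under stated assumptions: the right-hand side f(x, y) = sin y + x (for which the partial derivative in y is bounded by C = 1), the interval, the number of steps, and the perturbation are all chosen here, and the helper polygon_end is a hypothetical name.

```python
import math


def polygon_end(f, x0, y0, X, n):
    """Final ordinate Y of the polygon with n equal steps from x0 to X."""
    h = (X - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y


def f(x, y):
    # Assumed example: |df/dy| = |cos y| <= 1, so we may take C = 1.
    return math.sin(y) + x


x0, X, n = 0.0, 2.0, 50
H, C = X - x0, 1.0
y0, beta0 = 0.5, 1e-3

# Difference of the two polygons started at y0 + beta0 and at y0.
beta_n = polygon_end(f, x0, y0 + beta0, X, n) - polygon_end(f, x0, y0, X, n)
bound = abs(beta0) * math.exp(C * H)
print(abs(beta_n), bound, abs(beta_n) <= bound)
```

The final difference stays below $|\beta_0|e^{CH}$ whatever subdivision is used, which is why the bound survives the passage to the limit.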

Cauchy then checked that the theorems are true independently of the choice of
the $x_j$, and could then state the theorem that if $y = F(x)$ is the limiting value of
the function $Y = F(x_0, x_1, \ldots, x_j, \ldots, x_{n-1}, X, y_0)$ as $n$ increases indefinitely and
the distances $x_j - x_{j-1}$ decrease indefinitely, then $y = y(x)$ satisfies the differential
equation $\frac{dy}{dx} = f(x, y)$ with the initial condition $y(x_0) = y_0$.

The proof is little more than using the intermediate value theorem to show that $y$
is a continuous function of $x$, to bound the variation in $y$ as a function of $x$, and to
deduce that in the limit as the intervals between (otherwise arbitrary) $x$-values tend
to zero the function $y$ is a differentiable function of $x$ and $\frac{dy}{dx} = f(x, y)$ in the region
considered.

In Lecture 8, he strengthened the theorem to show that the stated conditions on
$f$ and $\frac{\partial f}{\partial y}$ imply that there is a solution to the differential equation on some interval
$x_0 \le x \le x_0 + a$. He showed that the differential equation has a solution for values of $x$
between $x_0$ and $x_0 + a$ and values of $y$ between $y_0 - Aa$ and $y_0 + Aa$, where $a$ is
a quantity determined by the function $f$ and $A$ is greater than the absolute value of
$f(x, y)$ in the interval considered (so $A$ depends on $a$). Curiously, Cauchy did not
prove that the constant $a$ always has a non-zero value but only assumed it.

As in Lecture 7, Cauchy took a sequence of points $x_0, x_1, \ldots, x_n$ and the
corresponding points $y_0, y_1, \ldots, y_n$ such that

$$y_{j+1} - y_j = (x_{j+1} - x_j) f(x_j, y_j),$$

and for all $j$, $|y_{j+1} - y_j| < Aa$. Adding these equations gave him

$$y_m - y_0 = \sum_{j=0}^{m-1} (x_{j+1} - x_j) f(x_j, y_j)$$

and the term on the right-hand side is equal to $(x_m - x_0) f^*$, where $f^*$ is an “average”
of the $f(x_j, y_j)$s. By assumption, for every $j$,

$$|f(x_j, y_j)| < A, \quad 0 \le j \le m - 1,$$

and so their average is also less than $A$. Therefore, if

$$y_0 - Aa < y_j < y_0 + Aa, \quad 0 \le j \le m - 1,$$

then also $y_0 - Aa < y_m < y_0 + Aa$, and so each $y_j$ is such that $|y_j - y_0| < Aa$.

It follows that for $x_0 < x < x_0 + a$ and $y_0 - Aa < y < y_0 + Aa$ the functions
$f(x, y)$ and $f_y(x, y)$ are continuous and bounded; and $|f(x, y)| < A$.

Therefore, the limit as the $x_j$ get closer and closer together exists and
$y - y_0 = \int_{x_0}^{x} f(x, y)\,dx$ is a solution of the differential equation that satisfies the given
initial conditions.

Cauchy did not stop with this local result. He sought conditions which would 
allow him to prolong the solutions out of the neighbourhood in which they have 
been shown to exist. He found that this could only be done indefinitely if certain 
necessary conditions were met, and gave examples to show that some differential 
equations have solutions that only exist for a limited range of the variable $x$. We
can note this simple one: the differential equation $\frac{dy}{dx} = 1 + y^2$, for which $x = 0$
implies $y = 0$, has the solution $y = \tan x$, but this is only defined on the interval
$(-\pi/2, \pi/2)$.
We should also note that these theorems give sufficient conditions for a solution 
to exist, but they are not necessary. 
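
The limited range of the solution $y = \tan x$ shows up clearly in a numerical experiment. The following Python sketch is an illustration only: the function name polygon_values, the step h, and the points at which values are compared are assumptions made here. It follows the polygon construction for dy/dx = 1 + y² from (0, 0) and compares the ordinates with tan x as x approaches π/2.

```python
import math


def polygon_values(f, x0, y0, h, n):
    """Ordinates y_0, ..., y_n of the polygon with constant step h."""
    ys = [y0]
    x = x0
    for _ in range(n):
        ys.append(ys[-1] + h * f(x, ys[-1]))
        x += h
    return ys


h = 1e-4
# Follow the polygon from x = 0 up to x = 1.5, just short of pi/2.
ys = polygon_values(lambda x, y: 1 + y * y, 0.0, 0.0, h, 15000)
for x in (0.5, 1.0, 1.5):
    j = round(x / h)
    print(x, ys[j], math.tan(x))
```

The approximations track tan x and grow rapidly as x nears π/2, so no choice of small steps can carry the solution across that point.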
Before we proceed, it will help to look at two examples. We start with the simple
case

$$\frac{dy}{dx} = -\frac{x}{y},$$

with the initial conditions that at $x = 0$ we have $y = a$.

Suppose we use the theorems to look for a solution for which when $x = 0$ we
have $y = 1$. The theorems tell us that there is a unique solution valid in some
neighbourhood of the point $(x, y) = (0, 1)$.

But if we use the theorems to look for a solution for which when $x = 1$ we have
$y = 0$ then the theorems tell us nothing, because in any neighbourhood of the point
$(1, 0)$ the quotient $\frac{x}{y}$ is unbounded.

These results correspond to the fact that the solution of the differential equation
in each case is $x^2 + y^2 = 1$, and in the first case the initial conditions imply that the
unique solution in this case is $y = (1 - x^2)^{1/2}$, whereas there is no single-valued
solution to the differential equation in any neighbourhood of the point $(1, 0)$. This
also brings up the important point that for Cauchy a solution to a differential equation
may be given only implicitly, in the form $f(x, y) = 0$, and not explicitly (as $y = y(x)$).


Now consider the first-order ordinary differential equation

$$\frac{dy}{dx} = \frac{1}{x + y}.$$

It looks as if something could go “wrong” when $x + y = 0$; let us see what happens.
We can solve this equation explicitly by writing

$$x + y = z, \quad \text{so} \quad dx + dy = dz,$$

when the equation becomes

$$\frac{dz}{dx} - 1 = \frac{1}{z}$$

or

$$\frac{dz}{dx} = \frac{1 + z}{z},$$

so

$$x = z - \log a(1 + z),$$

where $a$ is an arbitrary positive constant. This equation, in the original variables, says

$$x = x + y - \log a(1 + x + y),$$

or

$$y = \log a(1 + x + y),$$

or

$$e^y = a(1 + x + y).$$

This says that when $x + y = 0$ we have $y = \log a$, which, unexpectedly, is a constant.
What has happened is illustrated in Fig. 12.2, where $a = 2$. It is clear that $y$ cannot be
a single-valued function of $x$ in a neighbourhood of the point where $-x = y = \log 2$.

Fig. 12.2 The graph of $e^y = 2(1 + x + y)$

The figure confirms what the “infinite” value of $\frac{dy}{dx}$ already said: the implicitly
defined function that is the solution of the differential equation cannot be written as
a function $y$ of $x$ in a neighbourhood of the “bad” point.
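
The implicit solution can be checked by treating $x$ as a function of $y$, which is single-valued even where $y$ is not a function of $x$. The following Python sketch is an illustration under stated assumptions: the names x_of_y and slope_dx_dy, the value a = 2, the finite-difference step, and the sample values of y are all chosen here. It uses the fact that dy/dx = 1/(x + y) is equivalent to dx/dy = x + y.

```python
import math


def x_of_y(y, a=2.0):
    """Solve e^y = a (1 + x + y) for x as a single-valued function of y."""
    return math.exp(y) / a - 1.0 - y


def slope_dx_dy(y, a=2.0, eps=1e-6):
    """Central-difference estimate of dx/dy along the solution curve."""
    return (x_of_y(y + eps, a) - x_of_y(y - eps, a)) / (2 * eps)


for y in (-1.0, 0.2, math.log(2.0), 1.5):
    x = x_of_y(y)
    # Along the curve, dx/dy should equal x + y.
    print(y, x + y, slope_dx_dy(y))
```

At $y = \log 2$ both columns are (numerically) zero: the curve has a vertical tangent there, which is the geometric meaning of the “infinite” value of the derivative.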

It might seem that Cauchy paid a high price for rigour. Whereas earlier mathemati- 
cians offered general formulae, he could only provide solutions in a neighbourhood 
of an initial point where the differential equation was well behaved. This is, of course, 
entirely of a piece with his theory of functions, in which, for example, power series 
may converge only for a limited range of the variable. 

These lectures by Cauchy provide the first proof of the existence of solutions 
(locally, at least) to a first-order differential equation. The historian of mathematics 
has a particular interest in them because they illuminate the frailty of our knowledge 
of even the recent past. Unlike much of Cauchy’s work, they are not contained in 
the 31 volumes of his Oeuvres complétes. They are not even listed by the editors 
of those works as being among the texts omitted from the collected works. It was 
known that they had existed only because Cauchy mentioned them in a resumé of 
1835, and because his friend the Abbé Moigno gave an account of them in his book 
of 1841. Only in 1979 did Gilain track down a printed copy of the first 13 lectures
in the archives of the Institut de France.

The disappearance of these lectures is hard to explain, because Cauchy was an
energetic publisher and republisher of his own results. A crucial factor is likely to 
have been Cauchy’s discovery of a different proof in 1835, one he did publish (twice) 
and which did catch on. 

By then, Cauchy was in Prague, officially as a tutor to the Bourbon Dauphin 
(who, unsurprisingly, had no interest in learning what Cauchy had to say). In the 
introduction to his Prague paper, Cauchy stated that the integration of differential 
equations by series was “illusory, so long as one did not provide any means of assuring 
that the series so obtained were convergent”.¹ This emphasis on series solutions is a
sign of an important departure from his earlier presentation, although he did claim
that his new method shared the advantages of the earlier one. But now he worked from 
the start with systems of first-order ordinary differential equations satisfying given 
initial conditions. Following what he called Hamilton’s “wonderful paper” of 1834 
on the differential equations of dynamics (see Sect. 24.3), he reduced the system 
of ordinary differential equations to a single first-order linear partial differential 
equation. He then showed how to find enough particular solutions of the partial 
differential equation to find the general solution of the original ordinary differential 
equation. He then used his rigorous methods of analysis to show that the particular 
integrals could be expanded in convergent power series. 

What is curious about this paper is that the theoretical passages treat the indepen- 
dent variable x as a complex variable, but in the examples he gave to illustrate the 
theory the variable x is regarded as real. In their comments on this, the historians 
Bottazzini and Gray remark that: 


Apparently, even as late as this Cauchy did not seem to recognize the deep difference between 
the real and the complex case of his existence theorems, and the ambiguity between the 
real and complex ran through his entire paper, so much so that in his concluding remark he 
proudly stated that his new theorems could “easily” be extended to the solution of differential 
equations in which the variables and functions involved become imaginary. Precisely for this 
reason he preferred his second existence theorem based on the calculus of limits, which had 
transformed the integration of differential equations into a rigorous theory. 


So in this paper, and indeed in a number of papers he went on to write that 
drew on this paper, the condition on the functions that enter the differential equation 
is that they are complex analytic. This was to turn out to be a very much stricter 
condition than being infinitely differentiable, and very much stronger than the modest 
conditions on differential equations of a real variable that Cauchy had assumed in 
1821. The only reasons Cauchy can have preferred his Prague paper to his earlier 
account are that either he did not appreciate the difference in the conditions, or he 
thought that the more interesting case was functions of a complex variable anyway. 
The former reason is plausible for the period. The latter one is also plausible and it
fits with the intermittent but deep interest Cauchy had in establishing a rigorous theory
of functions of a complex variable. 


¹ See Cauchy ([35], 400).
² See Bottazzini and Gray ([22], 162).

12.2.1 Later Developments


Cauchy’s early emphasis on the local theory of real ordinary differential equations
was taken up and polished by Rudolf Lipschitz and Émile Picard to give a sound
account, using different methods, of the existence of solutions to ordinary differential
equations. For a brief look at the creation of the modern theory of real ordinary
differential equations, see Appendix G. Picard’s account is the final part of his study
[211] of partial differential equations, which we shall look at in Chap. 28.


12.3 Exercises


1. Show that Cauchy’s theorems 7 and 8 do not apply to the differential equation
$y' = y^{1/2}$. What happens in this case?

2. Show that Cauchy’s theorems 7 and 8 do not apply to the differential equation 
y’ = ~, What happens in this case? 


x 


Questions 


1. Cauchy gave these lectures around the time he gave the famous courses in which 
his new approach to analysis was put on record for the first time. Which of the 
ideas introduced in theorems 7 and 8 do you think need further investigation 
before they could be said to be rigorous? 

2. Ask yourself this question again when you have seen Cauchy’s lectures from 1819 
on first-order partial differential equations in Chap. 17. 


Chapter 13
Riemann and Complex Function Theory


13.1 Introduction 


Complex function theory was relatively new in the mid-nineteenth century.¹ After
decades of intermittent interest, Cauchy began to draw his insights together in the late 
1840s so that a younger generation could appreciate them, and more-or-less inde- 
pendently Riemann (Fig. 13.1) began to develop his theory, starting in 1851. This 
chapter looks specifically at Riemann’s approach to the subject, which greatly trans- 
formed it through a balance of intuitive and often geometrical ideas and a profound 
connection to the theory of harmonic functions, although at a cost in lack of rigour 
that some were to feel was too high to pay. 


13.2 Complex Function Theory 


In this section, some of the more important results will be indicated. The more 
intuitive ones will be explained; others, which require a more careful explanation, 
are standard in any book on complex function theory, and were given various accounts 
in the nineteenth century. 

Riemann had begun his [234] with an insight that it had taken Cauchy years to get 
clear: a function f of two real variables x and y that is a complex-valued function 
of a single complex variable z = x + iy is best understood (and had always been 
informally understood) to be a function on a domain in the complex plane on which 
it is complex differentiable. That is, if and only if the limit of the quotient 


\[ \frac{f(z+dz) - f(z)}{dz} \]

exists as dz tends to zero and does not depend on the direction of dz or, to put that
point another way, if it does not depend on the components dx and dy.

¹ See Bottazzini and Gray [22] for a comprehensive history.

Fig. 13.1 Bernhard Riemann (1826-1866) c. 1863. Riemann, Gesammelte mathematische Werke, 3rd edn. 1990

By letting dz = dx and then letting dz = i dy and equating the results, we obtain
the partial differential equation

\[ \frac{\partial f}{\partial x} = \frac{1}{i}\,\frac{\partial f}{\partial y}, \]

from which, on writing f(x, y) = u(x, y) + iv(x, y), the familiar Cauchy-Riemann
equations are obtained:

\[ u_x = v_y, \qquad u_y = -v_x. \]
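The equations can be checked numerically. The following is only a sketch under stated assumptions: the sample functions, the sample point and the step size h are choices made here for illustration and do not come from Riemann's paper. It estimates the four partial derivatives by central differences and compares a complex differentiable function with complex conjugation, which is not complex differentiable.

```python
import cmath

def partials(f, x, y, h=1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for f = u + iv.

    The step size h is an illustrative assumption."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cauchy_riemann_residual(f, x, y):
    """Return (u_x - v_y, u_y + v_x); both vanish where f is complex differentiable."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, uy + vx

def analytic(z):
    # A sample function that is complex differentiable everywhere.
    return cmath.exp(z) + z * z

def conjugate(z):
    # z -> conj(z) has u = x, v = -y, so u_x - v_y = 2 at every point.
    return z.conjugate()

r_analytic = cauchy_riemann_residual(analytic, 0.3, -0.7)
r_conjugate = cauchy_riemann_residual(conjugate, 0.3, -0.7)
```

Both residuals are close to zero for the first function, while the first residual for conjugation is close to 2, which reflects the dependence of the difference quotient on the direction of dz.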


They form a system of two coupled first-order partial differential equations, and 
it is interesting to note that although Riemann was willing to base his new theory 
of complex analytic functions on them Weierstrass later was not—an indication of 
Weierstrass’s preference for power series, to be sure, but also an indication of how 
opinions stood on the subject of partial differential equations in the mid- to late
nineteenth century.²

Riemann appreciated, but Cauchy did not, that these equations imply that at points
where the derivative of f does not vanish f is conformal (angle preserving), because
the differential can then be written as

\[ \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix} = \begin{pmatrix} u_x & u_y \\ -u_y & u_x \end{pmatrix} = g(z) \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}, \]

for a suitable function g(z) and a parameter t.


² For more on this, see Bottazzini and Gray ([22], Chap. 6). The connection with Dirichlet’s principle
is explored in Sect. 19.2.

Riemann also noted that if you differentiate these equations you obtain the equations

\[ u_{xx} + u_{yy} = 0, \qquad v_{xx} + v_{yy} = 0, \]


so u and v, the real and imaginary parts of f, are harmonic functions. He might well 
have learned of the conformal nature of complex analytic functions from Gauss, 
who had observed this fact in 1822, and he surely learned to appreciate harmonic 
functions from his mentor Dirichlet, so much so that much of Riemann’s theory of 
analytic functions is derived from a study of harmonic functions. 

It follows from the work of Cauchy that a function that is complex differentiable 
is infinitely differentiable and can be written as a power series in a neighbourhood 
of a point where it takes a finite value.³ Riemann knew this, but generally preferred
to treat complex functions geometrically. 


13.3 The Riemann Mapping Theorem


Riemann believed that the Dirichlet problem (see Sect. 18.3) had a solution. In other 
words, given a simply connected region T and a continuous function defined on 
the boundary of T, the function has a unique continuous extension to a harmonic 
function u defined on the interior of T. It followed that there was a unique complex 
differentiable function u + iv once the conjugate function was specified at a point.⁴

He gave a proof of this claim that was inadequate, because the theoretical under- 
standing of the Dirichlet problem was too poor, and the claim divided mathematicians 
for a generation. Some found it almost certainly true and worth using until a rigor- 
ous proof came along, and at the other extreme some found it beyond hope. (It was 
eventually established under very general conditions.) 

Riemann used it as the basis of an argument for a much stronger and deeper result 
that came to be called the Riemann mapping theorem⁵:


Theorem 13.2 The Riemann mapping theorem: Any two simply connected regions 
are not only topologically equivalent but analytically equivalent. 


(Riemann assumed that such regions are bounded by curves homeomorphic to circles, 
which need not be the case, but topology was being pulled into existence by this and 
other papers of his.) His proof that any two such regions, with boundaries as described, 
are in fact topologically equivalent was already novel. That they are analytically 
equivalent means that there is essentially only one simply connected domain for the 
purposes of complex function theory, and that can be taken as the unit disc. 


³ For the complicated history of this result, known today as Cauchy’s integral theorem, see Bottazzini
and Gray [22].

⁴ See Riemann ([234], Sect. 19).

⁵ Riemann only had in mind domains whose boundaries are topological circles; questions about
boundaries were only investigated much later.

Fig. 13.2 The initial stages of the mapping theorem


This was immediately seen to be a powerful result, but Riemann’s argument for 
it (in Sect. 21) was naive, and attracted repeated attempts to prove it, some of which 
we shall meet below. 

His argument, somewhat over-simplified, went as follows (see Fig. 13.2). Pick a
point $z_0$ in the interior of T, and let Θ be a small disc centre $z_0$ that lies entirely in T.
Consider the function $\log(z - z_0)$ on Θ cut along a radius ℓ, so that the branch of the
log function jumps by $-2\pi i$ as it crosses ℓ clockwise. Extend ℓ to a curve (also called
ℓ) that does not cross itself and reaches to the boundary of T. Extend the function
to a continuous complex function f(z) on the whole of T that is purely imaginary
on the boundary of T, and likewise jumps by $-2\pi i$ as it crosses ℓ. Therefore, the
imaginary part of this function goes from 0 to $2\pi i$ as z completes a circuit of the
boundary of T clockwise. Show that the integral of f(z) around the boundary of Θ
is zero, and over all of T is finite. Deduce that it is possible to define what is called
a Green’s function (see Sect. 18.2): a function g(z) that is infinite at $z_0$ and has zero
real part on the boundary of T.

On ℓ the value of the real part of h(z) = f(z) + g(z) goes from −∞ at $z_0$ to
0 where ℓ meets the boundary of T. Let −∞ < a < 0, and look at
$C_a = \{z \in T \mid \mathrm{Re}(h(z)) = a\}$ (see Fig. 13.3). Riemann claimed that this can only be a single
loop that does not cross itself, because T is simply connected. So T is filled out by
these loops, which are the level curves of the real part of h(z), and they do not intersect
each other. The function $e^{h}$ maps the boundary of T, which is the outermost
of these loops, onto the unit circle, and all the other loops onto concentric circles;
the point $z_0$ is mapped to the centre of the circle.

Therefore, T is analytically equivalent to the unit disc, and therefore any two 
simply connected regions are analytically equivalent.

Fig. 13.3 An indication of the final stages of the mapping theorem


13.4 A Look Ahead 


Chapter 14 discusses how Riemann tackled the hypergeometric equation. His analysis
focussed on the fact that the coefficients of the equation are infinite at precisely three
points: z = 0, 1, ∞, and therefore very likely the solutions, which will otherwise
be finite, will be infinite at those points. In fact, as Gauss had shown, the solutions
take the form of a power of the variable z times a convergent power series, or a
power of 1 − z times a convergent power series. So Riemann, who knew Gauss’s
papers [114, 115] very well, could see in advance that the solutions were likely to
be branched at z = 0, 1, ∞ and would usually be what were called in the nineteenth
century many-valued functions.

Riemann could also see, as Kummer had done, that Gauss was hinting that the
transformations z ↦ 1/z and z ↦ 1 − z were particularly relevant, and this is connected
to what are called Möbius transformations, which have the special property
of mapping triangles whose sides are arcs of circles to other circular-arc triangles in
an angle-preserving way.

It was also becoming clear to mathematicians by 1850 that although a power
series is convergent only inside some circular disc (which might be the whole plane
of complex numbers) the complex analytic function it defines might be defined on
a much larger region (think of $1 + z + z^2 + \cdots$ and $(1 - z)^{-1}$). For Weierstrass, in
particular, this invited the question of how a complex function could be defined on a
family of overlapping discs, and this led to a theory of analytic continuation.
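The example in parentheses can be made concrete. The sketch below is an illustration only: the sample points and the numbers of terms are assumptions chosen here. It compares partial sums of the geometric series with the closed form 1/(1 − z) at a point inside the unit disc and at a point outside it.

```python
def partial_sum(z, n):
    """Return 1 + z + z**2 + ... + z**n."""
    total, term = 0j, 1 + 0j
    for _ in range(n + 1):
        total += term
        term *= z
    return total

def closed_form(z):
    """The function 1/(1 - z), defined for every z except z = 1."""
    return 1 / (1 - z)

inside = 0.5 + 0.3j    # |z| < 1: the series converges to the closed form
outside = -2 + 0j      # |z| > 1: the partial sums grow without bound

err_inside = abs(partial_sum(inside, 200) - closed_form(inside))
size_outside = abs(partial_sum(outside, 50))
value_outside = closed_form(outside)
```

At the outside point the partial sums are useless, yet the function the series defines inside the disc still has the perfectly good value 1/3 there, which is the phenomenon that analytic continuation is meant to capture.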

Finally (for now), it was discovered that if a (non-constant) complex function is
defined everywhere including ∞ then it would have to take the value ∞ somewhere
(this can be handled with elementary limiting arguments). For example, the simplest
functions on the complex plane with a point at infinity are of the form

\[ z \mapsto \frac{az + b}{cz + d}, \]

where a, b, c, d are constants and ad − bc ≠ 0 (the Möbius transformations). They
take the value a/c when z = ∞ and the value ∞ when z = −d/c. Discovery of this
fact about complex functions is disputed between Hermite and Liouville, to whom
it more properly belongs, and Cauchy.⁶ It was also known to Riemann, who may
perhaps have discovered it for himself.

For convenience, a brief account of each of these topics will be found in 
Appendix E. 

The results we need about Möbius transformations are simple, and concern their
effect on straight lines and circles. As shown in Appendix F, a Möbius transformation
maps a straight line either to a straight line or a circle, and it also maps a circle to
either a straight line or a circle. Informally in this context mathematicians used to
think of a straight line as a circle passing through ∞. With that convention in place,
we can say that a Möbius transformation that maps three given points to three other
given points maps the circle through the first triple of points to the circle through
the second triple.
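One way to test the circle-preserving property numerically uses the cross-ratio, which is not introduced in the text above and is brought in here as an assumption: four distinct points lie on a common circle or straight line exactly when their cross-ratio is real, and a Möbius transformation leaves the cross-ratio unchanged. The coefficients and sample points below are arbitrary illustrative choices.

```python
import cmath

def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d), requiring ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be non-zero")
    return lambda z: (a * z + b) / (c * z + d)

def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of four distinct finite points (one common convention)."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))

# Four points on the unit circle (angles chosen arbitrarily).
points = [cmath.exp(1j * t) for t in (0.3, 1.1, 2.5, 4.0)]

# A sample transformation whose pole z = -d/c = 3i is off the unit circle.
T = mobius(2 + 1j, 1, 1j, 3)
images = [T(z) for z in points]

before = cross_ratio(*points)
after = cross_ratio(*images)
```

Here `before` is real up to rounding because the four points are concyclic, and `after` agrees with it, so the four image points are again concyclic (or collinear).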


13.5 Exercises 


1. Describe the image of the upper half-plane under the two-valued “map” $z \mapsto z^{1/2}$.
Extend your account to describe the image of the whole plane.

2. Find all the Möbius transformations mapping the upper half-plane to itself.

3. Find a Möbius transformation mapping the upper half-plane to the unit disc
centred on the origin, and interpret it as an inversion.

4. Hence, find all the Möbius transformations mapping the disc to itself.


Questions 


1. One of the great divides in mathematics is between a real-valued function of a 
real variable being differentiable, and a complex-valued function of a complex 
variable being complex differentiable. Operationally, it is hidden behind the for- 
malism, which looks like nothing more than switching x with z. For as long as 
it seemed natural to mathematicians to believe that functions were differentiable, 
even infinitely differentiable, this distinction was even harder to see. What signs 
have you seen of it already, whether appreciated or overlooked by the mathemati- 
cians you are considering? 


⁶ See Lützen ([192], Chap. XIII).


Chapter 14
Riemann and the Hypergeometric Equation


14.1 Introduction 


Kummer’s 24 solutions to the hypergeometric equation, each in the form of a hypergeometric
series in one of x, 1 − x, 1/x, 1/(1 − x), x/(x − 1), (x − 1)/x, possibly multiplied by some powers
of x and/or (1 − x), present each solution in a form that restricts it to a certain
domain and then give the relationship between overlapping solutions. In his [235],
Riemann proposed to apply his new, geometric methods to study the hypergeometric
equation as an equation for functions of a complex variable, noting correctly but
dramatically that these methods were essentially applicable to all linear differential
equations with algebraic coefficients.

Like Gauss, Riemann took the independent variable to be complex, and a possibly 
unexpected result of doing this is that what look like two independent complex 
solutions of the hypergeometric equation generally appear as two branches of the 
same many-valued function. Indeed, a typical way that branched or many-valued 
functions arise is in the solution of ordinary differential equations. 

The importance of Riemann’s work is both general and specific: general in that 
it opened up the theory of ordinary differential equations to complex functions of a 
complex variable, and specific in that it showed that equations like the hypergeometric 
equation can be recovered completely from a knowledge of the behaviour of their 
solutions under analytic continuation. 


14.1.1 Ordinary Differential Equations and Many-Valued Functions


First, we recall some elementary facts about the solution of the hypergeometric 
equation. We shall suppose that in the hypergeometric equation

\[ x(1 - x)\frac{d^2 w}{dx^2} + \bigl(\gamma - (\alpha + \beta + 1)x\bigr)\frac{dw}{dx} - \alpha\beta w = 0 \]

the variable x is complex. The standard method for solving this equation, and any
linear ordinary differential equation, in a neighbourhood of the origin is to substitute
a power series of the form

\[ a_0 x^{\lambda} + a_1 x^{\lambda + 1} + a_2 x^{\lambda + 2} + \cdots \]

and to look at the coefficient of the lowest power of x. This will be an equation for
λ, and in this case the equation is

\[ \lambda(\lambda - 1 + \gamma) = 0, \]

so

\[ \lambda = 0 \ \text{or}\ 1 - \gamma. \]


The recurrence relation for the coefficients then gives solutions in the form

\[ y = F(\alpha, \beta, \gamma, x) \quad\text{and}\quad x^{1-\gamma} F(\alpha - \gamma + 1, \beta - \gamma + 1, 2 - \gamma, x). \]

These solutions form a basis of solutions for the differential equation in a neighbourhood
of the origin. For use below let us call these solutions $y_{01}$ and $y_{02}$.
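As a numerical sanity check on the two series solutions, the following sketch sums each series term by term, differentiates it term by term, and substitutes the result into the hypergeometric equation. It is an illustration under assumptions made here: the parameter values, the sample point x = 0.25, the truncation at 200 terms and the function names are all hypothetical choices, not part of the text.

```python
def series_solution(alpha, beta, gamma, lam, x, terms=200):
    """Return (w, w', w'') for w = x**lam * sum c_n x**n, where c_n are the
    coefficients of the hypergeometric series F(alpha, beta, gamma, x).

    Derivatives are taken term by term, which is valid for 0 < x < 1."""
    w = dw = d2w = 0.0
    c = 1.0
    for n in range(terms):
        p = n + lam
        w += c * x ** p
        dw += c * p * x ** (p - 1)
        d2w += c * p * (p - 1) * x ** (p - 2)
        # Ratio of successive hypergeometric coefficients.
        c *= (alpha + n) * (beta + n) / ((gamma + n) * (n + 1))
    return w, dw, d2w

def residual(alpha, beta, gamma, w, dw, d2w, x):
    """Left-hand side of x(1-x)w'' + (gamma - (alpha+beta+1)x)w' - alpha*beta*w."""
    return x * (1 - x) * d2w + (gamma - (alpha + beta + 1) * x) * dw - alpha * beta * w

# Illustrative (assumed) parameter values; gamma is not an integer.
alpha, beta, gamma, x = 0.3, 0.7, 0.4, 0.25

first = series_solution(alpha, beta, gamma, 0.0, x)
second = series_solution(alpha - gamma + 1, beta - gamma + 1, 2 - gamma, 1 - gamma, x)

res_first = residual(alpha, beta, gamma, *first, x)
res_second = residual(alpha, beta, gamma, *second, x)
```

Both residuals vanish up to rounding, consistent with the exponents λ = 0 and λ = 1 − γ found from the coefficient of the lowest power of x.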


14.1.1.1 Result 1 


We shall now show that, certain exceptional cases aside, the solutions of the hyperge- 
ometric equation in the complex case are all branches of the same complex function. 

By what we said earlier, if the variable x goes on a small circle around the point
x = 0 the solution $y_{01}$ returns to its original value, and the solution $y_{02}$ returns
multiplied by $e^{2\pi i(1-\gamma)}$. This does not allow us to infer that the two solutions are
branches of the same function, so we must look further afield. For convenience, and
before we proceed, we write this information as a matrix equation (writing $\tilde{y}$ for the
value of y at $e^{2\pi i}x$)

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(1-\gamma)} \end{pmatrix} \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix}. \tag{14.1} \]

For brevity, we write this as

\[ \tilde{y}_0 = D_0 y_0. \]

Had we looked for solutions in a neighbourhood of x = 1 the same method would
have produced solutions that are power series in 1 − x possibly multiplied by a
power of 1 − x, and indeed we would have found that a basis of solutions in this
neighbourhood is given by

\[ F(\alpha, \beta, \alpha + \beta - \gamma + 1, 1 - x) \quad\text{and}\quad (1 - x)^{\gamma - \alpha - \beta} F(\gamma - \alpha, \gamma - \beta, \gamma - \alpha - \beta + 1, 1 - x). \]

We shall call these solutions $y_{11}$ and $y_{12}$.

The radius of convergence of all the four power series we have written down is 1,
and this means that in a neighbourhood of x = 1/2, say, we have four solutions, and
therefore two are linearly expressible in terms of the other two. Let us say that

\[ y_{01} = a_{11} y_{11} + a_{12} y_{12}, \qquad y_{02} = a_{21} y_{11} + a_{22} y_{12}, \]

where the coefficients $a_{jk}$ are constants.
We can write this information in matrix form as

\[ \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}. \tag{14.2} \]

For brevity, we write this as

\[ y_0 = A y_1. \]


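We now propose to look at what happens to the solutions $y_{01}$ and $y_{02}$ as x is taken
on a small circle around the point x = 1. We can do this by watching what happens
as $y_{11}$ and $y_{12}$ undergo the same journey. We know what they do individually, by
analogy with Eq. (14.1):

\[ \begin{pmatrix} \tilde{y}_{11} \\ \tilde{y}_{12} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma - \alpha - \beta)} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}. \tag{14.3} \]

Again, for brevity, we write this as

\[ \tilde{y}_1 = D_1 y_1. \]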


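So conducting x around the point x = 1 returns $y_{01}$ and $y_{02}$ as

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma - \alpha - \beta)} \end{pmatrix} \begin{pmatrix} y_{11} \\ y_{12} \end{pmatrix}, \tag{14.4} \]

which is

\[ \tilde{y}_0 = A D_1 y_1. \]

But this expresses them in terms of $y_{11}$ and $y_{12}$. To express the answer in terms of
$y_{01}$ and $y_{02}$, we must use the inverse of the transformation in Eq. (14.2), which gives
$y_1 = A^{-1} y_0$. This inverse is given by the matrix

\[ A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}, \]

and so the final result is that

\[ \tilde{y}_0 = A D_1 A^{-1} y_0. \]

Written out in full, this says

\[ \begin{pmatrix} \tilde{y}_{01} \\ \tilde{y}_{02} \end{pmatrix} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma - \alpha - \beta)} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \begin{pmatrix} y_{01} \\ y_{02} \end{pmatrix}. \tag{14.5} \]

To conclude that $y_{01}$ and $y_{02}$ are branches of the same function, it is enough to
show that the matrix

\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i(\gamma - \alpha - \beta)} \end{pmatrix} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]

is not diagonal, and a routine calculation shows that it will be diagonal if and only if

\[ e^{2\pi i(\gamma - \alpha - \beta)} = 1, \]

that is, if and only if γ − α − β is an integer.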
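The "routine calculation" can be imitated numerically. The sketch below conjugates the diagonal matrix $D_1$ by a connection matrix A and measures the off-diagonal entries of the result. The matrix A used here is purely hypothetical (the true entries depend on α, β, γ and are not given in the text); all that matters for the illustration is that its entries are non-zero. The same diagonal-or-not test applies whichever side the inverse of A is written on.

```python
import cmath
import math

def matmul(p, q):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2 x 2 matrix with non-zero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def monodromy(a, exponent):
    """Conjugate D_1 = diag(1, exp(2 pi i * exponent)) by the matrix a."""
    d1 = [[1, 0], [0, cmath.exp(2j * math.pi * exponent)]]
    return matmul(matmul(a, d1), inverse(a))

def off_diagonal_size(m):
    return abs(m[0][1]) + abs(m[1][0])

A = [[1, 2], [3, 5]]   # hypothetical connection matrix, all entries non-zero

generic = off_diagonal_size(monodromy(A, 0.3))      # exponent not an integer
integer_case = off_diagonal_size(monodromy(A, 2))   # exponent an integer
```

With a non-integer exponent (standing in for γ − α − β) the off-diagonal entries are far from zero, so continuation around x = 1 mixes the two solutions; with an integer exponent they vanish up to rounding.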
A very similar calculation involving a small circuit around x = ∞ can be carried
out. In this case, the basis of solutions valid near x = ∞ that we choose is

\[ y_{\infty 1} = x^{-\alpha} F(\alpha, \alpha - \gamma + 1, \alpha - \beta + 1, 1/x), \qquad y_{\infty 2} = x^{-\beta} F(\beta, \beta - \gamma + 1, \beta - \alpha + 1, 1/x). \]

Under analytic continuation around x = ∞ these solutions return as

\[ \tilde{y}_{\infty 1} = e^{-2\pi i \alpha} y_{\infty 1} \quad\text{and}\quad \tilde{y}_{\infty 2} = e^{-2\pi i \beta} y_{\infty 2}, \]

which we can write as

\[ \begin{pmatrix} \tilde{y}_{\infty 1} \\ \tilde{y}_{\infty 2} \end{pmatrix} = \begin{pmatrix} e^{-2\pi i \alpha} & 0 \\ 0 & e^{-2\pi i \beta} \end{pmatrix} \begin{pmatrix} y_{\infty 1} \\ y_{\infty 2} \end{pmatrix}. \tag{14.6} \]

So if we repeat the above calculation but extending a basis of solutions valid near
x = ∞ analytically around the point x = 1 we find that the corresponding matrix is
diagonal, and the two solutions are not branches of the same function, if and only if
α = β.

Likewise, if we extend a basis of solutions near x = 1 we find that the members
of the basis are branches of the same function unless γ = 1.

We conclude that, these special cases aside, the members of a basis of solutions
to the hypergeometric equation turn out, under analytic continuation, to be branches
of the same function.

14.1.2 The Riemann Sphere


One of Riemann’s simple but productive innovations was to work with the plane 
of complex numbers augmented by a point at infinity. This gave him a sphere, and 
he regarded the connection between the plane and the sphere as being given by 
stereographic projection. This is a map from the sphere to a tangent plane at a point 
S (which we shall call the South Pole) that is defined as follows. Let N (the North 
Pole) be the point on the sphere diametrically opposite to S, and let P be any point 
on the sphere other than N. The map from the sphere to the plane maps P to P′, the
point where the line NP meets the plane (Fig. 14.1).
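The construction translates directly into coordinates. The sketch below assumes, purely for illustration, a sphere of radius 1 centred at the origin, so that S = (0, 0, −1), N = (0, 0, 1) and the tangent plane at S is the plane at height −1; the text does not fix the size of the sphere, and the function names are hypothetical.

```python
def to_plane(p):
    """Project a point p = (X, Y, Z) of the unit sphere from the North Pole
    N = (0, 0, 1) onto the tangent plane at the South Pole (height -1).

    Returns the planar coordinates (u, v) of the image point P'."""
    X, Y, Z = p
    if Z == 1:
        raise ValueError("the North Pole has no image in the plane")
    t = 2 / (1 - Z)          # parameter at which the line N P meets the plane
    return (t * X, t * Y)

def to_sphere(q):
    """Inverse map: the second point where the line from N to (u, v, -1)
    meets the unit sphere."""
    u, v = q
    r2 = u * u + v * v
    return (4 * u / (r2 + 4), 4 * v / (r2 + 4), (r2 - 4) / (r2 + 4))

south_image = to_plane((0.0, 0.0, -1.0))   # S is the point of tangency
p = (0.6, 0.0, 0.8)                        # a point on the unit sphere
round_trip = to_sphere(to_plane(p))
```

Points close to N are sent far out in the plane, which is why N itself plays the role of the point at infinity.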


Fig. 14.1 Stereographic projection of the Riemann sphere onto the plane, Anschauliche Geometrie, 
2nd edition, Springer, 1996 


14.1.2.1 Result 2 


We shall now see that analytically continuing a solution in a loop that winds once 
around all three branch points returns the original solution. 

Suppose that we have a function with exactly three branch points, the points
z = a, b, c or, if you prefer, the points z = 0, 1, ∞. This means that as the function
is continued analytically around the point z = a it returns multiplied by a factor
$e^{2\pi i \alpha}$, as it is continued analytically around the point z = b it returns multiplied by
a factor $e^{2\pi i \beta}$, and as it is continued analytically around the point z = c it returns
multiplied by a factor $e^{2\pi i \gamma}$.

Suppose it is continued analytically around the point z = a and then around the
point z = b. It now returns multiplied by a factor $e^{2\pi i \alpha} e^{2\pi i \beta} = e^{2\pi i(\alpha + \beta)}$. By the
deformation principle, you can think of this path as consisting of a loop starting at
a point P (other than one of the branch points) that goes around the point z = a,
returns to P, and then goes around the point z = b and returns to P, or as a loop that
goes around both a and b.

If it is continued analytically around the point z = a, then around the point z = b,
and then around the point z = c it returns multiplied by a factor

\[ e^{2\pi i \alpha}\, e^{2\pi i \beta}\, e^{2\pi i \gamma} = e^{2\pi i(\alpha + \beta + \gamma)}. \]


But now something more can be said. For simplicity, suppose that we take a, b, and 
c to be points on the Riemann sphere. 

The following argument is clearest if we suppose that the three points are near 
the North Pole, and the path around them lies entirely south of them. Now we can 
imagine that the path is gradually deformed by moving it south until it lies arbitrarily 
close to the South Pole. It is now clear that continuing the function along this path 
returns it unchanged, and by the deformation principle stated above this means that 
the function returned unchanged along its original path. Therefore, α + β + γ must
be an integer.


14.2 Riemann’s P-Functions


Riemann began his [235] by specifying geometrically the functions he intended to
study. Any such function P is to satisfy the following three properties:¹

1. It has three distinct branch points at a, b, and c, but each branch is finite at all
other points.²

2. A linear relation with constant coefficients exists between any three branches
P′, P″, P‴ of the function: c′P′ + c″P″ + c‴P‴ = 0.

3. There are constants α and α′, called the exponents, associated with the branch
point a, such that P can be written as a linear combination of two branches $P^{(\alpha)}$
and $P^{(\alpha')}$ near a, where $(z - a)^{-\alpha} P^{(\alpha)}$ and $(z - a)^{-\alpha'} P^{(\alpha')}$ are single-valued, and neither
zero nor infinite at a. Similar conditions hold at b and c with constants β, β′ and
γ, γ′, respectively.


To eliminate troublesome special cases, Riemann further assumed that none of
α, α′, β, β′, γ or γ′ are integers, and that the sum α + α′ + β + β′ + γ + γ′ = 1.
He denoted such a function of z


when (a, b, c) = (0, ∞, 1).


¹ Riemann’s notation comes from his reading of Gauss’s still unpublished paper, which Riemann
explained in his note (1857b) he had read in Gauss’s Nachlass. Gauss had died in 1855.

² This corrects a rare slip in the English translation of Riemann’s work; Riemann said, in somewhat
obscure words, that the P-function is locally single-valued except at the points a, b, c.

In terms of the singular points at a, b, and c the first and third conditions say, for
example, that $P^{(\alpha)}$ is branched like $(z - a)^{\alpha}$. The second condition says that there
are at most two linearly independent determinations of the function under analytic
continuation of the various separate branches. One of the results Riemann established
is that, as with differential equations, this information specifies a P-function up to a
constant multiple.

The analogy between P-functions and hypergeometric functions becomes clear
if we take a = 0, b = 1, and c = ∞. There are two linearly independent solutions
of the hypergeometric equation at each singular point; they are branched according
to certain expressions in α, β, and γ; and any three solutions are linearly dependent.
Riemann showed that information of this kind about the solutions determines the differential
equation completely, which goes some way to explain the great significance
of the equation, as well as its special character.

It will be enough to record the results that he achieved. A more detailed look at 
how Riemann arrived at them is given in the next section of this chapter. 


Result 3 If $P_1$ is another P-function branched at the same three points a, b, c and
with the same six exponents α, α′, β, β′, γ, γ′ then $P_1$ is a constant multiple of P.


Result 4 If P and $P_1$ are two P-functions with the same branch points a, b, c and
exponents the corresponding pairs of which differ by integers, then $P_1$ is obtained
from P by multiplying by powers of z − a, z − b, and z − c, the precise powers
being determined by the differences in the exponents.


Result 5 A P-function satisfies a differential equation and when a, b, c = 0, 1, ∞
and the exponents of the P-function are suitably chosen the differential equation is
precisely the hypergeometric equation.


The conclusion is that, just like an algebraic function, a P-function is specified up
to a constant multiple by the information about its branching.

The most immediate and important difference between Riemann’s approach and
that of his predecessors is the relative lack of computation. As he remarked, his new
method “allows results that were formerly obtained in part only after somewhat troublesome
calculations to be derived almost immediately from the definition”.³ Rather
than starting from a hypergeometric series he began with a P-function having $\infty^2$
branches and three branch points. To be sure, any two linearly independent branches
have expansions as hypergeometric series, and any three branches are linearly dependent,
but the argument employed by Riemann inverted that of Gauss and Kummer.
His starting point was the set of solutions, functions which are shown to satisfy a
certain type of equation. Their starting point was the equation from which a range of
solutions are derived. Riemann showed that a very small amount of information, the
six exponents at the branch points, entirely characterises the equation and defines the
behaviour of the solutions. The hypergeometric equation is special in this respect,
as Fuchs [110] was able to explain, and consequently the task of generalising the
theory to cope with other differential equations was to be quite difficult.

³ See Riemann ([235], 3).


14.3 Riemann’s Arguments


The analytic continuation of a P-function is determined by what happens at the
branch points, because any closed path can be written as a product of loops around
a, b, and c. When two linearly independent branches P′ and P″, say, are continued
analytically in a loop around the branch point a in the positive (anti-clockwise)
direction they return as two other branches, $\tilde{P}'$ and $\tilde{P}''$, say. But then, by the second
defining property of a P-function,

\[ \tilde{P}' = a_1 P' + a_2 P'', \qquad \tilde{P}'' = a_3 P' + a_4 P'' \]

for some constants $a_1, a_2, a_3, a_4$, so the matrix $A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}$ describes what happens
at the point a. This is very like what was described in Result 1.

Let B and C be the matrices which describe the behaviour of P′ and P″ under
analytic continuation around b and c, respectively. Then, as Riemann noted (in line
with Result 2), a circuit of a and b can be regarded as a circuit of c in the opposite
direction, so

\[ CBA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

So, as Riemann remarked, “the coefficients of A, B and C completely determine the
periodicity of the function”.

Now, for definiteness, Riemann supposed a = 0, b = ∞, c = 1, and chose branches
$P^{(\alpha)}$, $P^{(\alpha')}$, $P^{(\beta)}$, etc. as in (3). A circuit around a in the positive direction returns
$P^{(\alpha)}$ as $e^{2\pi i \alpha} P^{(\alpha)}$ and $P^{(\alpha')}$ as $e^{2\pi i \alpha'} P^{(\alpha')}$, so

\[ A = \begin{pmatrix} e^{2\pi i \alpha} & 0 \\ 0 & e^{2\pi i \alpha'} \end{pmatrix}. \]

To express the effect on $P^{(\alpha)}$ and $P^{(\alpha')}$ of a circuit around b = ∞, he replaced them
by their expressions in terms of $P^{(\beta)}$ and $P^{(\beta')}$, conducted the new expressions around
∞, and then changed them back into $P^{(\alpha)}$ and $P^{(\alpha')}$ by writing

\[ \begin{pmatrix} P^{(\alpha)} \\ P^{(\alpha')} \end{pmatrix} = B' \begin{pmatrix} P^{(\beta)} \\ P^{(\beta')} \end{pmatrix}. \]

Then

\[ B = B' \begin{pmatrix} e^{2\pi i \beta} & 0 \\ 0 & e^{2\pi i \beta'} \end{pmatrix} B'^{-1}, \]

with a similar expression for what happens near c.
Since CBA = I, it follows on taking determinants that

\[ \det(A)\det(B)\det(C) = 1 = e^{2\pi i(\alpha + \alpha' + \beta + \beta' + \gamma + \gamma')}, \tag{14.7} \]

which is why Riemann had assumed α + α′ + β + β′ + γ + γ′ = 1.

The relation CBA = I is an equation in 2 × 2 matrices, so it yields four equations for
the eight entries in the two matrices B′ and C′ in terms of the six parameters of the
P-function α, α′, ..., γ′.

Riemann wrote the equations out explicitly, and showed that four of the eight
entries determine the other four. Indeed, he did better: in July 1856 he wrote down
how to express the entries in the matrices B′ and C′ in terms of the six coefficients
α, α′, β, β′, γ, γ′, but in the published paper he merely gave various ratios of the
entries.

These were enough to prove the next result, Result 3:


Result 3


If $P_1$ is another P-function branched at the same three points a, b, c and with the
same six exponents α, α′, β, β′, γ, γ′ then $P_1$ is a constant multiple of P.

This makes sense because, on the one hand, the branches of P and $P_1$ near z = a
behave in the same way, and, on the other hand, the analytic continuation of the
branches of P and $P_1$ around the other singular points is given by the same matrices,
so they should behave in exactly the same way everywhere.

More precisely, Riemann first showed that the ratio $P_1^{(\alpha)}/P^{(\alpha)}$ is constant in a
neighbourhood of z = a and then because the exponents are the same the analytic
continuation of the branch $P_1^{(\alpha)}$ is the same as that of $P^{(\alpha)}$ and so $P_1$ is a constant multiple of
P. This is where he made tacit use of Liouville’s theorem (see Sect. E.4).

As Riemann then remarked, a very similar argument deals with two P-functions
with the same branch points a, b, c and exponents the corresponding pairs of which
differ by integers.⁴ Now, although the analytic continuation is the same for the two
functions, the quotient $P_1^{(\alpha)}/P^{(\alpha)}$ is not constant in a neighbourhood of z = a, and
instead $(z - a)^{\delta} P_1^{(\alpha)}/P^{(\alpha)}$ is constant in that neighbourhood for a suitable integer power
δ. This gave him Result 4:


Result 4 


If P and $P_1$ are two P-functions with the same branch points a, b, c and exponents the
corresponding pairs of which differ by integers, then $P_1$ is obtained from P by multiplying
by powers of z − a, z − b, and z − c, the precise powers being determined
by the differences in the exponents.


4These are Riemann’s versions of Gauss’s contiguous functions. 
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Only now did Riemann satisfy himself that P-functions exist. He did this in Sect. 
7, where he deduced the next result: 


Result 5 


A P-function satisfies a differential equation, and when a, b, c = 0, 1, ∞ and the exponents of the P-function are suitably chosen the differential equation is precisely the hypergeometric equation. 
More precisely, P = y, $P_1 = \dfrac{dy}{d\log x}$, and $P_2 = \dfrac{d^2y}{d\log x^2}$ are three such P-functions and, as Riemann showed, they satisfy a linear relationship with coefficients that are certain rational functions in x. Explicitly, in the case γ = 0, he found that P satisfies the hypergeometric equation in this form⁵: 

$$(1 - z)\,\frac{d^2y}{d\log z^2} - (A + Bz)\,\frac{dy}{d\log z} + (A' - B'z)\,y = 0.$$


Accordingly, Riemann connected his P-functions with the functions F(α, β, γ, z) of Gauss: 

$$F(\alpha, \beta, \gamma, z) = \text{const}\cdot P\begin{pmatrix} 0 & \alpha & 0 \\ 1-\gamma & \beta & \gamma-\alpha-\beta \end{pmatrix}(z).$$


So Riemann had shown that the branching data of the P-function determined its monodromy relations, i.e. the group generated by the matrices A, B, C.⁶ Furthermore, Riemann had established that the hypergeometric equation is the only second-order linear equation whose solutions satisfy the geometric conditions of his three postulates. 

Riemann concluded by illuminating the relationship between P and F, its hypergeometric series representation. Since α and α′ may be interchanged (as may β and β′, and γ and γ′), 


⁵ The logarithmic derivative $\dfrac{d}{d\log x}$ satisfies $\dfrac{d}{d\log x} = x\dfrac{d}{dx}$ and $\dfrac{d^2}{d\log x^2} = x\dfrac{d}{dx} + x^2\dfrac{d^2}{dx^2}$, as you can check by setting u = log x and using the fact that $\dfrac{df}{du} = \dfrac{df}{dx}\dfrac{dx}{du}$. 


⁶ Monodromy matrices were introduced by Hermite in response to Puiseux [229], a paper that had 
carefully examined the effect of analytic continuation on a branch of an algebraic function around 
one of its branch points with a view to elucidating the integration of an algebraic function over a 
closed path containing a branch point; Cauchy reported on this work in Cauchy [38]. Riemann did 
not cite this work; as was the custom of the time, Riemann seldom gave references—except, in his 
case, to Gauss and Dirichlet—but he was well read all the same, and was undoubtedly one of those 
mathematicians who absorbed the work of others and then rederived it in his own way. The term 
monodromy group was first used by Jordan in his Traité ([154], 278) and its subsequent popularity 
derives from its successful use by Jordan and Klein in the 1870s. 

there are eight P-functions for each hypergeometric series in z, say 

$$P\begin{pmatrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \end{pmatrix}(z) = z^{\alpha}(1 - z)^{\gamma}\, F(\beta + \alpha + \gamma,\ \beta' + \alpha + \gamma,\ \alpha - \alpha' + 1,\ z).$$


There are six choices of variable, so 48 representations of a function as a P-function. 


14.4 Exercises 


1. Look back at the table of Kummer’s 24 solutions to the hypergeometric equation 
and determine their domains of convergence as functions of a complex variable. 

2. Find values of α, β, and γ so that these solutions are only two-, three-, or 
five-valued functions. 


Questions 


1. The properties of Riemann’s P-functions are obviously derived from the prop- 
erties of solutions of the hypergeometric equation. In what sense are they the 
essential properties? 


Chapter 15 
Schwarz and the Complex Hypergeometric Equation 


15.1 Introduction 


A second-order linear ordinary differential equation has a basis of two solutions, and 
it was to turn out that their quotient has interesting properties. Indeed, the set-theoretic 
inverse function has still more interesting properties that bring out a strong family 
resemblance between the regular or Platonic solids, plane lattices, and tessellations 
of the non-Euclidean disc. 


15.2 Quotients of Solutions 


First, we write the hypergeometric equation, or indeed an arbitrary second-order linear ordinary differential equation, in the form 

$$f'' + p f' + q f = 0, \tag{15.1}$$

where p and q are functions of z. 
To investigate the behaviour of two linearly independent solutions u and v of Eq. (15.1), it turned out to be convenient to introduce their quotient w = u/v. Then 

$$w' = \frac{u'v - uv'}{v^2}.$$

It is important to observe that 

$$w' = 0 \iff u'v - uv' = 0 \iff \frac{u'}{u} = \frac{v'}{v} \iff u = kv,$$

for some constant k. Because u and v are a basis of solutions, we infer that w′ never vanishes and so w is locally one-to-one (away, that is, from any singularities of u or v). 
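
The claim that w′ never vanishes can be made explicit with the Wronskian. What follows is a sketch of the standard argument, assuming Eq. (15.1) reads f″ + pf′ + qf = 0; the symbol W and the base point z₀ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
W &:= uv' - u'v, \\
W' &= uv'' - u''v
    = u(-pv' - qv) - (-pu' - qu)v
    = -p\,(uv' - u'v) = -pW, \\
W(z) &= W(z_0)\exp\!\Big(-\int_{z_0}^{z} p(t)\,dt\Big), \\
w' &= \frac{u'v - uv'}{v^2} = -\frac{W}{v^2}.
\end{align*}
```

Since u and v are linearly independent, W(z₀) ≠ 0 at a regular point z₀, so W, and with it w′, has no zeros wherever p is holomorphic and v does not vanish.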

Now we restrict our attention to the hypergeometric equation, for which 

$$p = \frac{\gamma - (\alpha + \beta + 1)z}{z(1 - z)}, \qquad q = -\frac{\alpha\beta}{z(1 - z)}.$$


We are going to follow Schwarz and make some assumptions about α, β, and γ as the argument proceeds. We start by assuming that they are all real. This means that the solutions of the hypergeometric equation take real values when z is restricted to real values. 

In 1872, Schwarz investigated the question of when the solutions to this differential equation are algebraic functions of z. That is to say, they are functions w(z) such that there is a polynomial equation F(z, w) = 0. (For example, in real variables, the equation x² + y² − 1 = 0 defines y as an algebraic function of x.) This means that for every value of z there is a finite number of values of w. We have seen that the quotient can be made to take real values, so the images of the segments (−∞, 0), (0, 1), and (1, ∞) can be real. 

Schwarz knew from Gauss’s work and Riemann’s that the solutions of the hypergeometric equation have three singular points, at 0, 1, ∞, and that the solutions are of the form $x^{\alpha} f_0(x)$ near x = 0, $(1 - x)^{\alpha} f_1(1 - x)$ near x = 1, and $(1/x)^{\alpha} f_\infty(1/x)$ near x = ∞, where the exponent α is one of the numbers in Kummer’s table and the fs are analytic functions that do not vanish at the point in question. 

When a function is algebraic there are only finitely many values of w for each value of z. The quotient of two solutions of the hypergeometric equation, like the individual solutions, is usually many valued, because it is of the form $z^{\alpha}$ times a holomorphic function. This immediately implies that the exponents α must be rational numbers, and this imposes conditions on the coefficients α, β, γ of the hypergeometric equation. Schwarz found that on setting 

$$(1 - \gamma)^2 = \lambda^2, \qquad (\alpha - \beta)^2 = \mu^2, \qquad (\gamma - \alpha - \beta)^2 = \nu^2,$$


where λ, μ, and ν are real and positive, the image of the upper half-plane under a quotient of two solutions of the hypergeometric equation has angles of λπ, μπ, and νπ. After imposing some further restrictions on α, β, and γ, he could even insist that λ, μ, and ν be reciprocals of integers. We shall now follow him part of the way. 

Schwarz observed that two linearly independent solutions of a hypergeometric 
equation that are each algebraic will have a quotient that is algebraic, because it can 
only have finitely many values at each point z. 

The quotient is singular at the points 0, 1, ∞ where the individual solutions are singular. Otherwise, because α, β, and γ are real, the hypergeometric equation has solutions that are real-valued on each of the segments (−∞, 0), (0, 1), and (1, ∞) of the real axis. The same is, therefore, true of the quotient: it can take real values on the real axis. But, because the quotient is many-valued, these segments have other images. 

Before we find what they are, let us look at the effect of the quotient on a neighbourhood of each singular point. The upper half-plane can be considered as a triangle with vertices at 0, 1, ∞, where the angle at each vertex is π. We can now see that the quotient w maps these angles to angles λπ, μπ, νπ. 

Now we need to check some things: 


• that each segment is mapped monotonically onto its image, 
• that no two of the sides cross, and 
• that the upper half-plane is mapped onto the interior of the triangle. 


These properties hold because we are dealing with a second-order ordinary differential equation, and so we know that w′ can only vanish at the singular points, and elsewhere w is locally one-to-one. In particular, it is monotonic on the segments. 

To find the images of the segments (−∞, 0), (0, 1), and (1, ∞), in general, we let a basis of solutions consist of two functions, say u(z) and v(z), and look at the quotient w(z) = u(z)/v(z). Suppose the complex variable z is taken on a loop and returns to its starting point. The functions u(z) and v(z) return as new solutions, and therefore can be expressed as linear combinations of u(z) and v(z), say au(z) + bv(z) and cu(z) + dv(z), where a, b, c, and d are constants determined by the loop. So the quotient returns as 

$$\frac{a\,w(z) + b}{c\,w(z) + d}.$$

So w has been subject to a Möbius transformation. 
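
The step from a linear change of basis to a Möbius transformation of the quotient can be checked numerically. The following is a minimal sketch; the values of u and v at the base point and the matrix entries a, b, c, d are hypothetical and chosen only for illustration.

```python
import cmath


def continue_around_loop(u, v, a, b, c, d):
    """Return the continued solutions (a*u + b*v, c*u + d*v)."""
    return a * u + b * v, c * u + d * v


def mobius(w, a, b, c, d):
    """Apply the Möbius transformation w -> (a*w + b) / (c*w + d)."""
    return (a * w + b) / (c * w + d)


# Hypothetical values of the two solutions at a base point, and a
# hypothetical invertible matrix (a b; c d) attached to some loop.
u, v = 1.5 + 0.5j, 0.25 - 2.0j
a, b, c, d = 2.0, 1.0j, -1.0, 0.5

w_before = u / v
u_new, v_new = continue_around_loop(u, v, a, b, c, d)
w_after = u_new / v_new

# The continued quotient is the Möbius image of the original quotient.
assert cmath.isclose(w_after, mobius(w_before, a, b, c, d))
```

Dividing the numerator and denominator of (au + bv)/(cu + dv) by v is all that the check relies on, so it holds for any matrix with ad − bc ≠ 0.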

Now we use the fact that a Möbius transformation sends a straight line or a circle to a straight line or a circle (see Appendix F). So if the image of one of these segments can be a straight line segment, and any other image is the transform of that one by a Möbius transformation, then the other images are either straight lines or circles. 
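
One standard way to test the circle-preserving property numerically uses the cross-ratio, a tool not mentioned in the text above: four points lie on a common line or circle exactly when their cross-ratio is real, and Möbius transformations preserve cross-ratios. A minimal sketch, with hypothetical coefficients and sample points:

```python
def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of four distinct complex numbers."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))


def mobius_map(z, a, b, c, d):
    """Apply z -> (a*z + b) / (c*z + d)."""
    return (a * z + b) / (c * z + d)


# Hypothetical coefficients with a*d - b*c != 0.
a, b, c, d = 1 + 2j, 0.5, 1j, 3.0

# Four points on the real axis, so their cross-ratio is real.
points = [-2.0, 0.25, 1.0, 4.0]
images = [mobius_map(z, a, b, c, d) for z in points]

cr_before = cross_ratio(*points)
cr_after = cross_ratio(*images)

# The cross-ratio is unchanged, hence still real: the four image points
# lie on a single straight line or circle.
assert abs(cr_after - cr_before) < 1e-12
assert abs(cr_after.imag) < 1e-12
```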

Consider now the segments (−∞, 0) and (0, 1). They meet at an angle of π at z = 0, and locally the quotient is a complex function multiplied by a power of z, say $z^{\lambda}$. So these two segments are mapped to straight lines meeting at an angle of πλ. What happens at the other two meeting points is this: the argument about the angle is exactly the same, but the segment joining them is the image of the real axis not by w but by some Möbius transformation of w, so it is a straight line or circular arc. As a result, the image of the whole upper half-plane is a straight-sided or circular-arc triangle with angles of πλ, πμ, and πν. 

Analytic continuation moves this region around by Möbius transformations, and the result is a net of circular-arc triangles, which are the images of the upper and lower half-planes by w as z is conducted around the plane, avoiding the branch points at z = 0, 1, ∞. 

It is one thing to know that the image of the upper (or lower) half-plane by the quotient w(z) is a circular-arc triangle, and another to know what happens as the variable z is led on one loop after another. We get a succession of images of the upper and lower half-planes and so a succession of circular-arc triangles. How do they fit together? Do they fit together like pieces of a jigsaw puzzle, or do they overlap any old how? 

Fig. 15.1 In every triangle the angles are either π/2 or π/3 or π/6, and cumulatively they cover the plane 

The crucial consideration is the angles at the vertices. If one of the angles is, say, π/6, then it is reasonable to suspect that 12 triangles will fit together at the vertex, each obtained from the one before by a turn through π/6, and this is indeed what happens. If a vertical angle is π/n for some integer n then 2n fit together at the vertex, each obtained from the one before by a turn through π/n. So if all three vertical angles are of the form π/n for integers n then we expect that successive images form a web of triangles. But if any vertical angle is not of that form then complicated overlaps will occur, which is why Schwarz (and Poincaré after him) excluded those cases. 

Suppose, for example, that the quotient we are looking at maps the upper half-plane to a triangle with angles π/2, π/3, and π/6. These angles sum to π, so all three sides of the image will be straight. Suppose for definiteness that 


• the point z = 0 is mapped to the point A where the angle is π/2. 
• the point z = 1 is mapped to the point B where the angle is π/3. 
• the point z = ∞ is mapped to the point C where the angle is π/6. 


Suppose that z goes on a path starting from z = i and exiting the upper half-plane 
between z = 0 and z = 1. Then the image of the lower half-plane will be a triangle 
ABC’ congruent to triangle ABC and attached along the edge AB. 

Let z re-enter the upper half-plane between z = 1 and z = ∞. The new image of the upper half-plane will be another triangle BC′A′ congruent to ABC and attached to the previous one along the edge BC′. 

The process continues in the fashion shown in Fig. 15.1. 

Each light triangle is the image of the upper half-plane, and each dark triangle the 
image of the lower half-plane. 

The figure may equally well be regarded as a web of equilateral triangles (each containing three light and three dark triangles, such as ABD and BCD), corresponding to a different hypergeometric equation in which the angles at the vertices are all π/3. 

It can also be seen as a web of parallelograms (made by joining two equilateral 
triangles together, such as ABCD). Now we have a figure with four vertices, so we 
are no longer dealing with the hypergeometric equation but with a different ordinary 
differential equation. 

In each of these cases, the point z = i has an image in the first, third, fifth, ..., triangles, and the point z = −i has an image in the second, fourth, sixth, ..., triangles. We get some sort of a map from the complex plane to the web of triangles we have constructed, but it is not easy to say what is the image of z = i because it seems to have infinitely many images. Suppose for a moment that they have been marked P₁, P₃, P₅, ..., each of them in the same position in the appropriate triangle. 

What is much clearer is the set-theoretic inverse map, from the web of triangles or parallelograms back to the complex plane. This is reminiscent of the arcsin function. There are infinitely many angles whose sine is 1/2, but the sine function treats them alike: 

$$\sin(\pi/6) = \sin(\pi/6 + 2k\pi) = \tfrac{1}{2} = \sin(5\pi/6 + 2k\pi)$$

for any integer k. 

For definiteness, let us return to the example where the upper and lower half-planes are each mapped to triangles with angles π/2, π/3, and π/6. In each light triangle is a point $P_{2k+1}$. The function that is the set-theoretic inverse of w in this case maps them all to the point i. Similarly, w has mapped any given point ζ in the upper half-plane to a point in each light triangle, and the inverse function maps all those points back to ζ. 

In the parallelogram case, the simplest thing that can happen is that every point ζ in the upper half-plane has two images in each parallelogram.¹ The inverse function is now one that takes every value twice (this is again reminiscent of the sine function). But what is more interesting is that the points where it takes a given value form into two families. In each family, the points form a lattice; they are separated by integer multiples of the lengths AB and AD. If we represent AB by the complex number ω₁, and AD by the complex number ω₂, then it follows that the inverse function, which we shall call F, satisfies these equations: 

$$F(z) = F(z + \omega_1) = F(z + \omega_2)$$

for every z. By analogy with the sine function, the function F is said to be doubly periodic. It is also an elliptic function.² 


¹ This is a consequence of the Cauchy integral theorem in complex function theory. 

² The efforts to identify doubly periodic and elliptic functions involved many mathematicians from Gauss to Riemann and Weierstrass. The connection between elliptic integrals (integrals of the form $\int \frac{dt}{\sqrt{f(t)}}$, where f(t) is a quartic in t) and elliptic functions was one of the great discoveries of Abel and Jacobi in the 1820s. See, for example, Bottazzini and Gray [22] and more briefly Gray [127]. 

Fig. 15.2 In every triangle the angles are all 2π/5, so the triangles are all congruent, in Nash ([203], 101), copyright Elsevier (2014) 


Now, Schwarz’s problem was to arrange for there to be only finitely many image triangles, so the condition he was led to discover was that λ + μ + ν > 1. The only triangles that meet this condition are the so-called digons (triangles with angles π/2, π/2, π/n that fit together like 2n segments of an orange bisected at the equator) and the triangles that fit together to form the faces of one of the regular solids (so the triangles are a decomposition of a regular solid into congruent triangles). 

If we set the digons aside, the only examples of congruent triangles with an angle sum greater than π and angles of the form 2π/n for some integer n (the condition that n of them meet at each vertex) are these: 


• angles of 2π/3, 2π/3, and 2π/3. 
• angles of π/2, π/2, and π/2. 
• angles of 2π/5, 2π/5, and 2π/5. 


These form the faces of the regular tetrahedron, octahedron, and icosahedron (see 
Fig. 15.2), respectively, as they appear on the sphere. 

Schwarz also noticed that when λ + μ + ν = 1 the triangles are Euclidean and fit together to cover the plane. And he gave one example when λ + μ + ν < 1: λ = 1/5, μ = 1/4, ν = 1/2 (Fig. 15.3), but he missed its true significance, and was to become very cross with Poincaré when he pointed it out. Poincaré’s entirely reasonable opinion, expressed to a mutual acquaintance, was that missing this discovery was Schwarz’s fault, and there was nothing that could be done about it. 
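
The three cases distinguished by the angle sum can be summarised in a short computation. The sketch below is illustrative only: the function names are hypothetical, the triple (1/2, 1/3, 1/7) is an example not taken from the text, and the count of triangles assumes the usual area formula for a spherical triangle on the unit sphere (area equals the angle sum minus π).

```python
from fractions import Fraction


def classify(lam, mu, nu):
    """Classify a triangle with angles lam*pi, mu*pi, nu*pi by its angle sum."""
    total = lam + mu + nu
    if total > 1:
        return "spherical"
    if total == 1:
        return "Euclidean"
    return "non-Euclidean"


def triangles_on_sphere(lam, mu, nu):
    """Number of congruent copies needed to cover the unit sphere once.

    The triangle has area (lam + mu + nu - 1)*pi and the sphere has area 4*pi.
    """
    excess = lam + mu + nu - 1
    if excess <= 0:
        raise ValueError("the angle sum must exceed pi")
    return Fraction(4) / excess


# The Euclidean example from the text: angles pi/2, pi/3, pi/6.
print(classify(Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)))  # Euclidean

# A hypothetical example with angle sum below pi.
print(classify(Fraction(1, 2), Fraction(1, 3), Fraction(1, 7)))  # non-Euclidean

# Faces of the tetrahedron, octahedron, and icosahedron on the sphere.
for angle in (Fraction(2, 3), Fraction(1, 2), Fraction(2, 5)):
    print(triangles_on_sphere(angle, angle, angle))  # 4, then 8, then 20
```

The counts 4, 8, and 20 agree with the numbers of faces of the three regular solids named above.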


15.3 Exercises 


1. What nets of congruent triangles can you draw on the surface of a sphere? What 
are the angles at each vertex? 

2. What nets of congruent triangles can you draw on the plane? What are the angles at each vertex? 

3. What do you make of Schwarz’s net in Fig. 15.3? Look on the web for other figures of this kind. 

Fig. 15.3 The triangles with angles π/5, π/4, π/2 cover a disc, Schwarz Gesammelte Mathematische Abhandlungen, vol. 2, 240 


Questions 


1. We have seen that the map ζ(z) = $z^{1/2}$, which maps the complex z-plane to the complex ζ-plane, is not too hard to understand, but even this simple case can be confusing. The set-theoretic inverse map z(ζ) = ζ² is a much simpler 2 : 1 map. If you accept that a quotient of solutions to the hypergeometric equation maps the upper half-plane to a triangle, what does its set-theoretic inverse do? 

2. What properties does the set-theoretic inverse map have when considered as a 
map of the whole net of triangles? 


Chapter 16 
Complex Ordinary Differential Equations: Poincaré 


16.1 Introduction 


Poincaré (Fig. 16.1) took up the hypergeometric equation in order to enter a prize competition, but it led him to his discovery of what he called Fuchsian functions and the role of non-Euclidean geometry in complex analysis, and it made his name as a mathematician of the first rank.¹ 


16.2 Poincaré and Linear Ordinary Differential Equations 


Henri Poincaré was born at Nancy on 29 April 1854, the son of a professor of medicine at the university there. Apparently, he had a happy childhood, and his mother, a very active and intelligent woman, consistently encouraged him intellectually. His brilliance at mathematics became apparent in the final years at school and he entered the École Polytechnique at the top of his class, despite a poor performance in drawing. He had a lifelong capacity to immerse himself completely in abstract thought; it was said of him that he thought all the time. Although he was extremely prolific, he seldom bothered to resort to pen and paper; he disliked taking notes and gave the impression of taking ideas in directly, and of having a perfect memory for details of all kinds. When asked to solve a problem he could reply, it was said, with the swiftness of an arrow. 

In 1875, he graduated from the École Polytechnique, only second because of another low mark in drawing, and proceeded to the École des Mines. When he graduated from there he became a mining inspector, and had to write a report on a mining disaster in which 22 people were killed. In 1878, he presented his doctoral thesis to the faculty of Paris on the subject of partial differential equations. Darboux, one of the examiners and an early supporter of Poincaré, said that the thesis contained enough ideas for several good theses, although some points in it still needed to be corrected or made precise; Poincaré never did this. By now he had decided on a career as a mathematician, and by December 1879 he was in charge of the analysis course at the Faculty of Sciences at Caen. 

¹ For more detail, see Gray [124], upon which this account is based. 

Fig. 16.1 Jules Henri Poincaré (1854–1912) 

That year the prize competition of the Académie des Sciences called for a contribution to the theory of differential equations, and on March 22 Poincaré submitted his essay ‘Mémoire sur les courbes définies par une équation différentielle’. In it he considered first-order non-linear differential equations 

$$\frac{dx}{X(x, y)} = \frac{dy}{Y(x, y)},$$

where X and Y are real polynomial functions of real variables x and y, and investigated the global properties of their solutions. But he withdrew the essay on 14 June 1880, and the examiners never reported on it.² 

Instead, he turned his attention to some work by the German mathematician 
Lazarus Fuchs, and submitted an essay on a topic connected with it on 28 May 
1880. Fuchs was a former student of Kummer’s who had then taken up Weierstrass’s 
complex function theory, and was particularly interested in linear differential equa- 
tions in the complex domain. In his major papers (1865) and (1866), he had shown 
how to generalise Riemann’s insights into the hypergeometric equation to linear 
ordinary differential equations of any order, and he had successfully characterised 
those equations all of whose solutions are holomorphic everywhere except for a finite 
number of points where the coefficients of the differential equation become infinite. 

In 1880, Fuchs had returned to the topic with a new question, and as part of this work he had considered a second-order linear differential equation with a basis of solutions f₁(z) and f₂(z) and investigated what happens to their quotient ζ(z) = f₁(z)/f₂(z) under analytic continuation. It is clear that when z is taken on a loop that encloses a singular point, the quotient returns in the form 

$$\frac{a_{11} f_1(z) + a_{12} f_2(z)}{a_{21} f_1(z) + a_{22} f_2(z)},$$

and so ζ is a many-valued function of z. 

² It did, however, lead to a series of papers initiating the subject of flows on surfaces (see, e.g., Poincaré [213, 214]) and the great memoir on celestial mechanics [215]. 

Poincaré read Fuchs’s work, but was not persuaded by it. He entered into a corre- 
spondence with Fuchs that is too technical to describe here. But it is clear that while 
Fuchs was deeply immersed in Weierstrass’s complex function theory, with its insis- 
tence on power series methods, which are essentially local in scope, Poincaré had 
immediately picked up on the global nature of the solutions to differential equations. 

Poincaré looked at the simplest cases, among them the hypergeometric equation, which has three singular points. Suppose one of the singular points is at the origin, and a basis of solutions is given by the functions $z^{\rho_1} f_1(z)$ and $z^{\rho_2} f_2(z)$, where f₁ and f₂ are holomorphic and non-zero in a neighbourhood of the origin; then their quotient is $z^{\rho_1 - \rho_2} f_1(z)/f_2(z)$. The factor f₁(z)/f₂(z) is holomorphic and non-zero near the origin, so the behaviour of the quotient near the origin is governed by the exponent difference, ρ₁ − ρ₂. 
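
To see concretely how the exponent difference governs the local behaviour, write ρ for it (a symbol used here only for illustration). One anticlockwise loop around the origin multiplies a power z^ρ by exp(2πiρ), while the holomorphic factor returns unchanged. The sketch below, with hypothetical function names, computes this factor and the number of loops after which the quotient returns to its starting value when ρ is rational.

```python
import cmath
from fractions import Fraction


def local_multiplier(rho):
    """Factor acquired by z**rho after one anticlockwise loop around 0."""
    return cmath.exp(2j * cmath.pi * float(rho))


def loops_to_return(rho):
    """Smallest number of loops after which z**rho returns to its start.

    rho is taken as a rational number; the answer is its denominator
    in lowest terms.
    """
    return Fraction(rho).denominator


rho = Fraction(1, 3)  # an exponent difference of 1/3, as an example
factor = local_multiplier(rho)

# Three loops bring the quotient back to its starting value.
assert loops_to_return(rho) == 3
assert cmath.isclose(factor ** 3, 1)
```

An irrational exponent difference would give a factor that never returns to 1, which is one way to see why rational exponent differences matter for the questions discussed here.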

Poincaré wrote to Fuchs to say that if the exponent differences were ρ₁, ρ₂, and ρ₃ at infinity, then either ρ₁ + ρ₂ + ρ₃ > 1, in which case z is rational in ζ, or ρ₁ + ρ₂ + ρ₃ = 1, in which case z was doubly periodic. Even in this case there were difficulties, and Poincaré supplied an example to show that Fuchs’s theorem was still wrong. But, if a requirement that Fuchs had imposed on the differential equations he was considering was dropped, then the case ρ₁ + ρ₂ + ρ₃ < 1 could be included, which, Poincaré remarked, gives a “much greater class of equations than you have studied, but to which your conclusions apply. Unhappily my objection requires a more profound study, in that I can only treat two singular points”. However, z is still single-valued, and 


These functions I call Fuchsian, they solve differential equations with two singular points 
whenever ρ₁, ρ₂, and ρ₃ are commensurable with each other. Fuchsian functions are very 
like elliptic functions, they are defined in a certain circle and are meromorphic inside it. 


On the other hand, he concluded, he knew nothing about what happened when 
there were more than two singular points. 

When Poincaré wrote to him again on 30 July, his own researches on the 
new functions had led him to discover that they 


present the greatest analogy with elliptic functions, and can be represented as the quotient 
of two infinite series in infinitely many ways. Amongst those series are those which are 
entire series [which] converge in a certain circle and do not exist outside it, as thus does the 
Fuchsian function itself. 


³ See Gray [124, 125]. 

and he went on to explain how the new functions were the solutions of an extensive class of differential equations. 

Before we turn to discuss what Fuchsian functions are, note that they only illu- 
minate the study of a differential equation if they can be defined independently of 
the equation. This Poincaré did by introducing Fuchsian and theta-Fuchsian series. 
Moreover, by calling these functions Fuchsian and not Schwarzian Poincaré was 
showing that he had not read Schwarz’s paper. He was to be much criticised by Klein 
for this, but he refused to back down, and he only ended the quarrel by naming a 
related but new class of functions Kleinian, even though Klein had not had much to 
say about them. 

In the essay Poincaré submitted to the competition, he considered when the quotient ζ = ζ(z) of two independent solutions of the differential equation $\frac{d^2y}{dz^2} + Qy = 0$ defines, by inversion, a meromorphic function z of ζ (note that y, z, and ζ are all complex variables). He showed that Fuchs’s conditions were not necessary and sufficient. Rather, for z to be meromorphic on some domain it was necessary and sufficient that the exponents at each singular point, including infinity, differ by an aliquot part of unity (i.e. ρ₁ − ρ₂ = 1/n, for some positive integer n). If the domain is to be the whole complex sphere then this condition is still necessary, but it is no longer sufficient. He found that there were too many special cases for Fuchs’s methods to work easily, and so he proposed to take a new approach, beginning with Fuchs’s example of a differential equation in which there are two finite singular points a₁ and a₂, where the exponent differences are 1/3 and 1/6, and the exponent difference at ∞ is 1/2. 
In this case, he found the change in z was of the form 

$$\frac{z' - \alpha}{z' - \beta} = e^{2\pi i/3}\,\frac{z - \alpha}{z - \beta}$$

under analytic continuation around a₁, of the same form with the factor $e^{\pi i/3}$ in place of $e^{2\pi i/3}$ under analytic continuation around a₂, and with the factor −1 around ∞. Note that the first two of these maps keep the points α and β fixed and are otherwise like rotations.⁴ 

Accordingly, z is a meromorphic single-valued function of ζ mapping a parallelogram composed of eight equilateral triangles onto the complex sphere, and ζ = ∞ is its only singular point, so z is an elliptic function. The differential equation, Poincaré showed, has in fact an algebraic solution 

$$y_1 = (x - a_1)^{1/3}(x - a_2)^{5/12}$$

and a non-algebraic solution y₂ such that 

$$\frac{y_2}{y_1} = \int (x - a_1)^{-2/3}(x - a_2)^{-5/6}\,dx.$$

This result agrees with Fuchs’s theory. 

⁴ For information about coaxial circles, see Appendix F. 

Fig. 16.2 A quadrilateral in the unit disc (Poincaré, Oeuvres 1, 365) 

However, it might be that the domain of ζ could not be the whole ζ-sphere. 
Poincaré showed that this could happen even when the differential equation had only 
two finite singular points. For example, if the exponent differences were a 5, and 7 
at ∞, then as long as z crosses no cuts ζ stays within a quadrilateral aga’y (see his 
Fig. 2, p. 86, given as Fig. 16.2). 

Furthermore, however z is conducted about in its plane, ζ cannot escape the circle HH′. Poincaré described the quadrilateral as “mixtiligne”; the circular-arc sides meet the circle HH′ at right angles. This geometric picture is quite general: curvilinear polygons are obtained with non-re-entrant angles and circular-arc sides orthogonal to the boundary circle. Thus, the domain of z is |ζ| < OH, and Poincaré then investigated whether z is meromorphic. This reduces to showing that, as ζ is continued analytically, the polygons do not overlap. This does not occur if the angles satisfy conditions derived from Fuchs’s theory, unless the overlap is in the form of an annular region. 

However, if the angles are not re-entrant, this cannot happen, and so z is meromorphic. Poincaré’s proof of this is of incidental interest. He projected the circle HH′ stereographically onto the southern hemisphere of a sphere, and then projected the image orthogonally back onto its original plane. The circular arcs orthogonal to HH′ become straight lines, which renders the theorem trivial. This result virtually concluded Poincaré’s essay. As he said in his letter to Fuchs, his understanding was limited essentially to the case of two finite singular points. 

Poincaré also wrote three anonymous supplements to the essay bearing the motto “Non inultus premor” that were received by the Académie on 28 June, 6 September, and 20 December.⁵ We shall look at them shortly for the surprising connection they make between the theory of linear differential equations and non-Euclidean geometry. 

In due course, Poincaré’s essay was awarded second prize.⁶ Hermite, one of the judges, said of the essay that⁷: 


the author successively treated two entirely different questions, of which he made a profound 
study with a talent by which the commission was greatly struck. The second ...concerns the 
beautiful and important researches of M. Fuchs, .... The results ...presented some lacunas 
in certain cases that the author has recognized and drawn attention to in thus completing 
an extremely interesting analytic theory. This theory has suggested to him the origin of 
transcendents, including in particular elliptic functions, and which has permitted him to 
obtain the solutions to linear equations of the second order in some very general cases. 
A fertile path is there that the author has not entirely gone down, but which manifests an 
inventive and profound spirit. The commission can only urge him to follow up his researches 
in drawing to the attention of the Academy the excellent talent of which they give proof. 


16.3 Poincaré’s Breakthrough and Non-Euclidean 
Geometry 


Poincaré himself has left us one of the most justly celebrated accounts of the process of mathematical discovery, which concerns exactly his route to the theory of Fuchsian functions.⁸ Poincaré gave this account in a lecture to the Société de Psychologie in Paris in 1908, and it was later published as the third essay in his volume Science et Méthode [221], with the title “L’invention mathématique” (Mathematical discovery). 

Poincaré began by doubting that Fuchsian functions could exist, but shortly came to the opposite view. He tells us in the lecture ([221], p. 50) that: 


For two weeks I tried to prove that no function could exist analogous to those I have since 
called the Fuchsian functions: I was then totally ignorant. Every day I sat down at my desk 
and spent an hour or two there: I tried a great number of combinations and never arrived 
at any result. One evening I took a cup of coffee, contrary to my habit; I could not get to 
sleep, the ideas surged up in a crowd, I felt them bump against one another, until two of them 
hooked onto one another, as one might say, to form a stable combination. In the morning I 
had established the existence of a class of Fuchsian functions, those which are derived from 
the hypergeometric series. I had only to write up the results, which just took me a few hours. 


5This is the motto of Poincaré’s home town of Nancy; it means “No-one touches me with impunity”.
The supplements are to be found in the Poincaré dossier in the Académie des Sciences, but for
whatever reason Nörlund did not publish them when he published the essay in Acta Mathematica,
and they were not included in Poincaré’s Oeuvres. The supplements confirm and greatly amplify
what Poincaré said in the lecture 28 years later, and have since been published as Poincaré [225].
6It was ranked behind one by Halphen, and was not published until Nörlund edited it for Acta
Mathematica Vol. 39, 1923, 58-93, in Oeuvres, I, 578-613.


7Quoted in Poincaré, Oeuvres, II, 73. 
8For more detail, see Poincaré [225]. 
Then he had to find an independent description of the new functions, so that it is a
meaningful remark that they solve certain differential equations. He went on ([221], 
51): 


I then wanted to represent the functions as a quotient of two series; this idea was perfectly 
conscious and deliberate; the analogy with elliptic functions guided me. I asked myself what 
must be the property of these series, if they exist, and came without difficulty to construct 
the series that I called theta-fuchsian. 


Next, he said that ([221], 51-52):


At that moment I left Caen where I then lived, to take part in a geological expedition organized 
by the Ecole des Mines. The circumstances of the journey made me forget my mathematical 
work; arrived at Coutances we boarded an omnibus for I don’t know what journey. At the 
moment when I put my foot on the step the idea came to me, without anything in my previous 
thoughts having prepared me for it; that the transformations I had made use of to define the 
Fuchsian functions were identical with those of non-Euclidean geometry. I did not verify
this, I did not have the time for it, since scarcely had I sat down in the bus than I resumed 
the conversation already begun, but I was entirely certain at once. On returning to Caen I 
verified the result at leisure to salve my conscience. 


The supplements go beyond the essay precisely in their use of non-Euclidean 
geometry. In the first supplement, Poincaré began by reviewing the tessellation of 
the disc by “mixtiligne” quadrilaterals obtained by successively operating on one, 
which he called Q, by transformations M and N. He observed (p. 9) that these 
transformations form a group, and remarked: 


There are close connections between the above considerations and the non-Euclidean geometry
of Lobachevskii. In fact, what is a geometry? It is the study of a group of operations formed by
the displacements one can apply to a figure without deforming it. In Euclidean geometry the
group reduces to rotations and translations. In the pseudogeometry of Lobachevskii it is more
complicated. Indeed, the group of operations formed by means of M and N is isomorphic
(‘isomorphe’) to a group contained in the pseudogeometric group. To study the group formed
by means of M and N is therefore to do the geometry of Lobachevskii. Pseudogeometry
will consequently provide us with a convenient language for expressing what we will have
to say about this group. (Poincaré’s emphasis.)


Poincaré’s realisation on boarding the bus at Coutances can be described very sim- 
ply. He realised that the straightened version of the “mixtiligne” figures described 
at the end of his Prize essay was identical with the figures in Beltrami’s descrip- 
tion of non-Euclidean geometry [5, 6]; that, therefore, the original figures were 
conformally accurate representations of non-Euclidean figures; and finally that this 
meant the transformations formed from M and N were non-Euclidean isometries. 
Beltrami’s detailed discussion of the non-Euclidean differential geometry of the disc 
enabled Poincaré to give a new meaning to his previously analytical transformations.
Consequently, on p. 20, he remarked that: 


The Fuchsian functions are to the geometry of Lobachevskii what the doubly periodic func- 
tions are to that of Euclid. 


Poincaré remained stuck on the case of the hypergeometric equation at least until 
his fourth letter, 30 July. Liberation came from an unexpected source, arithmetic 
([221], 52, 53):

I then undertook to study some arithmetical questions without any great result appearing
and without expecting that this could have the least connection with my previous researches.
Disgusted with my lack of success, I went to spend some days at the sea-side and thought
of quite different things. One day, walking along the cliff, the idea came to me, always
with the same characteristics of brevity, suddenness, and immediate certainty, that the
arithmetical transformations of ternary indefinite quadratic forms were identical with those of
non-Euclidean geometry. 


Once back at Caen I reflected on this result and drew consequences from it; the example 
of quadratic forms showed me that there were Fuchsian groups other than those which 
correspond to the hypergeometric series; I saw that I could apply them to the theory of
theta-Fuchsian series, and that, as a consequence, there were Fuchsian functions other than
those which derived from the hypergeometric series, the only ones I knew at that time. I naturally
proposed to construct all these functions; I laid siege systematically and carried off one after 
another all the works begun; there was one however, which still held out and as the chase 
became involved it took pride of place. But all my efforts only served to make me know the 
difficulty better, which was already something. All this work was quite conscious. 


The second supplement is given over to a more rigorous description of non-Euclidean
geometry, and to tessellations of the disc by polygons with angles $\pi/m$
for integers $m$. When the polygon is a triangle, he also discussed more carefully the
ways of constructing Fuchsian functions in this case and was led to conjecture a result 
which he said he was not yet in any state to prove—the Riemann mapping theorem! 
Then, on p. 17, he abruptly stated the connection with the theory of quadratic forms, 
a subject upon which Hermite was an expert. 

He let $T$ be a matrix (“substitution”) with integer coefficients which preserved an
indefinite ternary quadratic form $\Phi$ and $S$ be a substitution sending
$\xi^2 + \eta^2 - \zeta^2$ to $\Phi$. Then $STS^{-1}$ preserves $\xi^2 + \eta^2 - \zeta^2$
and sends $(\xi, \eta, \zeta)$ to $(\xi', \eta', \zeta')$, say.


The quantities 
me buns owl! P2e tas 
GT 


are related by a transformation $z' = zK$ of the non-Euclidean plane provided
$\xi^2 + \eta^2 - \zeta^2 < 0$. He did not prove that a sheet of the hyperboloid of two
sheets provides a model of non-Euclidean geometry (which is easy enough to establish) and
remarked only that (p. 19): “All the points $zK$ are the vertices of a polygonal net
obtained by decomposing the pseudogeometric plane into polygons pseudogeometrically
equal to each other”.

The third supplement, of only 12 pages, was received on 20 December. Its main 
result is the extension of the method of polygonal decomposition to include cases 
where the angles are zero, and the roots of the indicial equation differ by integers. The 
notable example is Legendre’s equation. Poincaré’s method is to push the polygons
outwards until one or more vertices are “at infinity”, i.e. are on the boundary of the
disc, and the corresponding angles vanish. Since the polygons hitherto studied had
angles which were only rational multiples of $\pi$, Poincaré’s argument relies heavily
on its geometrical plausibility.

Fig. 16.3 A tessellation of the non-Euclidean disc by triangles. (Klein, Gesammelte
mathematische Abhandlungen, vol. I, p. 126)

The unpublished work makes abundantly evident the astounding clarity of
Poincaré’s mind, coupled to an almost equally dramatic ignorance of contemporary
mathematics. There is no mention of the work of Schwarz on the hypergeometric
equation, nor is there any mention of the work of Dedekind or Klein, and even
Hermite’s work on modular functions, which he must have known, seems to have been
forgotten. In fact, these omissions are not mere oversights; Poincaré genuinely did
not then know the German work.

This work marks two things: Poincaré’s arrival on the mathematical scene, and the
recognition of non-Euclidean geometry as an important tool in mathematics and not merely
as an unexpected feature of geometry.


16.4 Non-Euclidean Geometry 


For us to be able to say that all the triangles in Fig. 16.3 are congruent, we have to 
define a sense of distance and a group of distance-preserving transformations such 
that corresponding sides have the same lengths and corresponding angles are equal. 
The convenient thing about this kind of picture of non-Euclidean geometry is that it 
represents angles correctly. This requires us to use angle-preserving transformations 
of the disc to establish congruences, so we invoke Möbius transformations. But what
about lengths? 

To investigate a formula for distance that is invariant under such transformations,
we begin by looking at maps of the above kind that also map the real axis to itself.
They are Möbius transformations of the form

$$z \mapsto \mu(z) = \frac{z + r}{rz + 1}, \quad r \in \mathbb{R}.$$
We want a formula for the distance between two points on the real line with the
property that the distance from $0$ to $a$ is the same as the distance from $\mu(0) = r$ to
$\mu(a) = \frac{r + a}{1 + ra}$. If we write $d$ for distance, then we want $d$ to be such that

$$d(0, a) = d(\mu(0), \mu(a)) = d\left(r, \frac{r + a}{1 + ra}\right).$$

We notice that if $r = \tanh \rho$ and $a = \tanh \alpha$ then the formula we want says that

$$d(0, \tanh \alpha) = d(\tanh \rho, \tanh(\rho + \alpha)).$$

This will be the case if we define $d$ by the formula

$$d(z_1, z_2) = \tanh^{-1} \left| \frac{z_1 - z_2}{1 - z_1 z_2} \right|,$$

for then $d(0, \tanh \alpha) = \alpha$ and

$$d(\tanh \rho, \tanh(\rho + \alpha)) = \tanh^{-1} \tanh(\rho + \alpha - \rho) = \tanh^{-1} \tanh \alpha = \alpha,$$

as required.
If we represent Euclidean distances from the centre by $r$ and non-Euclidean distances
from the centre by $\rho$, then the above formula says that

$$\rho = \tanh^{-1}(r), \quad \text{or} \quad r = \tanh \rho,$$

so

$$dr = \operatorname{sech}^2 \rho \, d\rho = (1 - \tanh^2 \rho)\, d\rho$$

and therefore

$$d\rho = \frac{dr}{1 - r^2}.$$

This makes good sense. As a point moves outwards its Euclidean distance from the centre
tends to 1 and so $1 - r^2$ tends to zero. Therefore, equal non-Euclidean steps of $d\rho$ are
represented by steadily smaller Euclidean steps of $dr$.
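
The invariance just derived can be checked numerically. The following is only a sketch under stated assumptions: the function names `mu` and `d` and the sample values of r, a, and the step size are chosen here for illustration and do not come from the text. It confirms that the distance from 0 to a equals the distance between their images under the Möbius map, and that equal non-Euclidean steps from the centre correspond to shrinking Euclidean steps.

```python
import math

def mu(z, r):
    """Moebius map z -> (z + r) / (r z + 1) for real r with |r| < 1."""
    return (z + r) / (r * z + 1)

def d(z1, z2):
    """Non-Euclidean distance between two real points of the interval (-1, 1)."""
    return math.atanh(abs((z1 - z2) / (1 - z1 * z2)))

# Illustrative values (assumptions, not taken from the text).
r, a = 0.6, 0.3
before = d(0.0, a)
after = d(mu(0.0, r), mu(a, r))
print(before, after)  # the two distances should agree up to rounding

# Equal non-Euclidean steps of size 0.5 from the centre, seen as Euclidean radii.
radii = [math.tanh(0.5 * k) for k in range(6)]
euclidean_steps = [radii[k + 1] - radii[k] for k in range(5)]
print(euclidean_steps)  # each step is smaller than the one before
```

The second print illustrates the relation between dr and dρ: the Euclidean radii tanh(ρ) crowd towards 1 as ρ grows.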
It can now be proved that in non-Euclidean geometry: 


geodesics appear as arcs of circles perpendicular to the boundary circle;
circles appear as circles;
the angle sum of a triangle is less than $\pi$;
the area of a triangle with angles $\alpha$, $\beta$, $\gamma$ is proportional to $\pi - (\alpha + \beta + \gamma)$;
there are many parallels to a given line through a point not on that line.


Theorems like these gave the new geometry its name, and occasioned much debate 
in the nineteenth century about its logical coherence and its physical applicability. 
With the profusion of new geometries in the twentieth century it gradually lost its
position in the panoply of geometry, and is now usually known by the name Klein 
gave it: hyperbolic geometry. 


16.4.1 Summary 


Whether the web of triangles is on the sphere, on the plane, or on the non-Euclidean 
disc, the triangles in each case are mutually congruent. There is a small number of 
distinct spherical triangles, a small number of distinct Euclidean ones, and an infinite 
number of distinct non-Euclidean ones that can be used. In each case, the web can be 
mapped to itself by a group of isometries, and there are functions F on the sphere, 
plane, or non-Euclidean disc with the property that if g is a member of the appropriate
group of isometries then F(gz) = F(z). 

This generalises the periodic behaviour of the trigonometric functions. For example,
$\sin(x + 2k\pi) = \sin(x)$. In this case, the sine function is defined on the real line
and the group of integers acts as follows: $k \in \mathbb{Z}$ sends $x \in \mathbb{R}$ to $x + 2k\pi$.

This was not exactly how it was seen before Poincaré. Schwarz’s discovery of 
the connection between the hypergeometric equation and the regular solids was 
new. The case of elliptic functions was well known, but the double periodicity of 
these functions was not seen as related to an action of the group $\mathbb{Z} \oplus \mathbb{Z}$ on the
complex plane. Poincaré’s introduction of non-Euclidean geometry was wholly new 
and surprising. 


16.5 Exercises 


1. Show that a Möbius transformation that maps the unit disc to itself and fixes the
points $z = -1$, $0$, and $1$ is the identity map.

2. Call the arc of a circle that lies inside the unit disc and meets the unit circle at
right angles (and also any diameter of the unit circle) a d-line. Show that a Möbius
transformation that maps the unit disc to itself maps d-lines to d-lines.

3. Show that any Möbius transformation that maps the unit disc to itself cannot map
a segment of a d-line to a proper subset of itself.


Questions 


1. Explain why the above exercises show that it is possible to speak of the length of 
a segment of a d-line. Why did Poincaré regard the presence of a group of Möbius
transformations that map the unit disc to itself as almost synonymous with the 
existence of a (non-Euclidean) geometry in the disc? 

2. Find what you can about what is called the Kleinian view of geometry. 


Chapter 17
More General Partial Differential Equations


17.1 Introduction 


As clarity grew about the existence of solutions for linear ordinary differential equa- 
tions, and the existence of solution methods for them advanced, it became clear to 
mathematicians that the corresponding story for partial differential equations was 
woefully underdeveloped. The first to prove any kind of general existence theorem 
for partial differential equations was Cauchy, who in 1819 successfully treated the 
general first-order partial differential equation. His work on initial conditions
established the framework that later became known as the Cauchy problem. Ampère also
did important work on the subject at the same time in his [2]. Cauchy then returned to
the topic in a paper of 1842 and gave an argument to show that a partial differential 
equation of any order defined by one or more analytic equations has an analytic 
solution in a neighbourhood of a suitably chosen initial curve (or, if the equation has 
more than two independent variables, a hypersurface). This is the origin of the idea 
of the Cauchy hypersurface condition for hyperbolic partial differential equations, 
but Cauchy’s account left much for later mathematicians to do. 

Cauchy’s ideas were rediscovered and extended by Sonya Kovalevskaya, who
also documented unexpected issues with initial conditions and showed how the existence
theorem can fail, and in the 1870s Darboux further improved the analysis.

Meanwhile, in the 1860s, Riemann had given a clear example of how the method 
of characteristics can show that solutions will cease to exist, and applied this in the 
study of the propagation of sound to show how shock waves can form. His paper, 
which is discussed in Chap. 22, also contains innovative ideas about the use of Green’s 
function methods in the study of hyperbolic partial differential equations. 
17.2 Cauchy’s Method in 1819


Cauchy was a student at the Ecole Polytechnique from 1805 to 1807, where he 
was taught analysis by Lacroix, whose Traité Elementaire de Calcul Différentiel et 
Intégral was required reading.1 The account it gives of partial differential equations
was by then fairly standard, and owed much to Monge’s approach—as one might 
expect at the Ecole Polytechnique. The equation 


$$Pp + Qq = R,$$

for a function $z(x, y)$, where $P$, $Q$, and $R$ are functions of $x$, $y$, and $z$,
$p = \partial z/\partial x$ and $q = \partial z/\partial y$, was solved by eliminating $p$
from the equation and the identity $dz = p\,dx + q\,dy$ to obtain

$$P\,dz - R\,dx = q(P\,dy - Q\,dx).$$


Lacroix then distinguished the simpler case, where the differential $P\,dz - R\,dx$
only involves $x$ and $z$ and the differential $P\,dy - Q\,dx$ only involves $x$ and $y$, from
the general case. The simple case is solved by the method of integrating factors, 
but in the general case the method fails, although ad hoc changes of variable can 
sometimes help, as Lacroix proceeded to show. Monge’s geometric approach was 
relegated to a footnote in Sect. 348, where it was described as “very ingenious”. 
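
To make the simple case concrete, here is an illustrative instance that is not taken from Lacroix: the choice P = x, Q = y, R = z is an assumption made for this sketch. Then P dy − Q dx = x dy − y dx involves only x and y, and P dz − R dx = x dz − z dx involves only x and z. The integrating factor 1/x² turns these into d(y/x) and d(z/x), so the relation d(z/x) = q d(y/x) suggests solutions of the form z = x f(y/x) for an arbitrary differentiable f. The code below checks this numerically by finite differences, with f = sin as a hypothetical choice.

```python
import math

def z(x, y, f=math.sin):
    """Candidate solution z = x * f(y / x) of x*p + y*q = z (illustrative choice of f)."""
    return x * f(y / x)

def partials(x, y, h=1e-6):
    """Central-difference estimates of p = dz/dx and q = dz/dy."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return p, q

def residual(x, y):
    """Value of P*p + Q*q - R with P = x, Q = y, R = z; should be close to zero."""
    p, q = partials(x, y)
    return x * p + y * q - z(x, y)

# Sample points away from x = 0, where y / x is undefined.
points = [(1.0, 0.5), (2.0, -1.0), (0.7, 3.0)]
for point in points:
    print(point, residual(*point))
```

Any other differentiable f could be passed to `z` in place of `math.sin`; the residual stays at the level of the finite-difference error.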

Monge’s geometric method was redescribed by Cauchy, who explained it at length 
for equations in two variables and then showed how to overcome the problems 
of extending it to any number of variables. You can read Cauchy’s paper [34] in 
Sect. 31.1; it makes an instructive comparison with that of Monge.2 It is likely, given
Cauchy’s growing appetite for mathematics, that he read Monge’s account. 

Neither Monge nor Cauchy specified what conditions on the function defining the 
partial differential equation are necessary for their proofs to work, but it is likely that 
Monge assumed that everything is analytic in something like the sense that every 
function admits a power series expansion, and that Cauchy assumed that functions 
were no more differentiable than necessary. That would put his paper of 1819 on a 
par with his paper a couple of years later on ordinary differential equations and with 
his introduction of epsilon-delta analysis at the Ecole Polytechnique in 1821. That 
said, as was typical in Cauchy’s work, he let conditions on the function f emerge in 
the course of the proof. In fact, although his [34] is an existence proof Cauchy never 
used the term “exist” and never stipulated what hypotheses on f he used, namely, 
that f be continuously differentiable. 

One of the assessment questions on this part of the course is to give an account
of Cauchy’s proof in his paper [34] that first-order partial differential equations
have solutions; there is a translation of the paper in Chap. 31. It would therefore be
inappropriate to give a detailed explanation of it here, but we can note that it opens
with a clever change of variables argument that greatly simplifies the equation to be
solved, then there is an investigation of a necessary condition, then a quick, analytical
derivation of the equations that Monge had exhibited a decade before, and then an
investigation of the initial conditions.

1The Ecole Polytechnique had just been reorganised by Napoleon as a military school; Cauchy
entered third in the ranking of the 125 entrants.

2The paper is reprinted in his Oeuvres series 2, volume 2, 1958.


17.3 Cauchy and the General Partial Differential Equation


We must also be sketchy in our account of Cauchy’s papers of 1842 on the existence
of solutions to the general partial differential equation for two reasons: they are
much more difficult papers, and what Cauchy provided is itself little more than an outline
at crucial points.

He had already stated his aims in the paper Cauchy [36]; I take this translation
from Cooke ([47], 25):


In the theory of equations mathematicians have properly considered fundamental the question 
whether every equation has a root. Similarly in the integral calculus one of the most important 
questions, a fundamental question, is obviously whether every ordinary or partial differential 
equation can be integrated. But — and this ought to surprise us at first sight — despite the 
numerous works of mathematicians on the integral calculus, this question, important though 
it be, is nowhere solved in full generality. To be sure the existence of general integrals 
of ordinary differential equations, which contain only one independent variable, is now 
established by two different methods which I have given, the first in my lectures at the Ecole 
Polytechnique, the second in a lithographed memoir of 1835. .... In addition, the existence 
of general integrals of partial differential equations is established in certain cases where 
one is able to integrate these equations, for example when the equation reduces either to a 
single equation of first order or to linear equations in which the coefficients of the unknowns 
and of their derivatives remain constant. But does an arbitrary system of ordinary or partial 
differential equations always admit a corresponding system of general integrals? Such is 
the problem which seemed to me worthy of the attention of mathematicians. The present 
solution is based on considerations which I shall explain briefly. 


For a long time mathematicians, supposing without proof that every ordinary or partial 
differential equation admits a general integral, have considered Taylor’s formula as the means 
of developing this integral in a series of increasing integer powers of an increment i given 
to an independent variable t, which can be considered as representing time. Further, using a 
theorem which I proved in 1831 relating to the development of functions, one can be sure that 
in the case where the series so obtained is convergent, the sum of the series satisfies, as an 
integral, the ordinary or partial differential equation, at least for real or complex values of the 
increment i whose moduli do not exceed a fixed bound. Moreover the same remark applies 
to the sums of the series obtained when, assuming the existence of general integrals of a 
system of ordinary or partial differential equations, one sets about developing them in Taylor 
series. But in all cases it remains to be proved that the series so obtained is convergent, at 
least for i of sufficiently small modulus. Now this end can be achieved using a fundamental 
theorem which not only determines a bound beneath which the modulus of i may vary 
arbitrarily without causing the series obtained to diverge, but also determines a bound on the 
error caused by terminating each series after a certain number of terms. The proof of this 
theorem is based, as will subsequently be seen, on the principles of the new calculus which I 
have called “calcul des limites” and on a device of analysis which can be given many useful 
applications. 
Cauchy then went on to develop his “calcul des limites”, which we call the method
of majorants, for determining if a series converges by comparison with another series 
that has larger terms but does converge. 
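
A small numerical sketch may help fix the idea of a majorant. It is not Cauchy's own argument: the coefficients sin(n)/2^n and the constants M = 1, R = 2 are assumptions chosen for illustration. Because each coefficient is bounded in absolute value by M/R^n, the geometric series with terms M(|x|/R)^n dominates the given series term by term, which yields both convergence for |x| < R and an explicit bound on the error made by stopping after a given number of terms.

```python
import math

def coeff(n):
    """Illustrative coefficients with |c_n| <= M / R**n for M = 1 and R = 2."""
    return math.sin(n) / 2**n

def partial_sum(x, terms):
    """Sum of c_n * x**n for n = 0, ..., terms - 1."""
    return sum(coeff(n) * x**n for n in range(terms))

def majorant_tail(x, terms, M=1.0, R=2.0):
    """Sum of the geometric majorant M * (|x| / R)**n over all n >= terms."""
    q = abs(x) / R
    return M * q**terms / (1 - q)

x = 1.5                          # any value with |x| < R
reference = partial_sum(x, 400)  # stands in for the full sum; the neglected tail is negligible
for terms in (5, 10, 20, 40):
    error = abs(reference - partial_sum(x, terms))
    print(terms, error, majorant_tail(x, terms))  # the error stays below the majorant tail
```

The two bounds Cauchy mentions in the passage quoted above appear here as the radius R and the tail of the geometric series.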

An application of these ideas to the theory of partial differential equations came in 
his [37]. He began by claiming that any partial differential equation can be reduced 
to a system of first-order (and what we would call) quasi-linear partial differential 
equations by introducing more unknown functions, and then said that therefore it 
was enough to show how to solve a system of such first-order partial differential 
equations. 

In his [37] he gave a careful argument to show that a single such equation can be
solved if the equation has analytic coefficients and certain analytic initial data is 
given, and in his [37] he dealt with a system of such equations. 

In his [37], Cauchy set himself the task of showing that a partial differential
equation of the form

$$u_t = a_1 u_{x_1} + a_2 u_{x_2} + \cdots + a_n u_{x_n} + v \qquad (17.1)$$

for an unknown function $u$ has a solution, where $a_1, a_2, \ldots, a_n$ and $v$ are analytic
functions of the independent variables $x_1, x_2, \ldots, x_n$, and $t$, and the value of $u$ is
prescribed in a neighbourhood of a point in which $t = \tau$, a constant. Thus, the initial
data is the value of $u(x_1, x_2, \ldots, x_n, \tau)$ and its partial derivatives with respect to the
other variables $x_1, x_2, \ldots, x_n$, namely, $u_{x_j}(x_1, x_2, \ldots, x_n, \tau)$, $j = 1, \ldots, n$.

He now investigated the consequences of assuming the partial differential equation
has a solution that is a power series in powers of $t - \tau$ about some value $\tau$ of $t$.
This means that near $t = \tau$ the solution $u$ is a function $w$ of the form

$$w = l_0 + l_1 (t - \tau) + l_2 (t - \tau)^2 + \cdots. \qquad (17.2)$$

The coefficient $l_n$ is given by

$$l_n = \frac{D_t^n w}{n!},$$

where $D_t^n w$ is the $n$th derivative of $w$ with respect to $t$ evaluated at the point $t = \tau$. His
task was now to show that this series converges for suitably small absolute values of
$t - \tau$.
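
As a hedged illustration of how the coefficients in (17.2) are generated, consider the simplest instance of (17.1): one space variable, coefficient 1, no inhomogeneous term, and initial time 0, that is, u_t = u_x with u(x, 0) = 1/(1 − x). This example and the names below are assumptions made for this sketch; they are not Cauchy's. Here differentiating in t is the same as differentiating in x, so the nth coefficient is 1/(1 − x)^(n+1), and the series is geometric with sum 1/(1 − x − t) whenever |t| < |1 − x|.

```python
def coefficient(n, x):
    """l_n = D_t^n w / n! at t = 0 for u_t = u_x with u(x, 0) = 1 / (1 - x).

    Since D_t acts as D_x on solutions, D_t^n w = n! / (1 - x)**(n + 1).
    """
    return 1.0 / (1.0 - x) ** (n + 1)

def series(x, t, terms):
    """Partial sum of the power series in t with the coefficients above."""
    return sum(coefficient(n, x) * t**n for n in range(terms))

def exact(x, t):
    """Closed form u(x, t) = 1 / (1 - x - t), valid while x + t != 1."""
    return 1.0 / (1.0 - x - t)

x, t = 0.2, 0.3  # |t| < |1 - x| = 0.8, so the series converges here
for terms in (5, 20, 80):
    print(terms, series(x, t, terms), exact(x, t))
```

In this toy case the convergence question answers itself, because the series is geometric; in the general case the coefficients are far less explicit.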

Cauchy interpreted the partial differential equation as saying that

$$D_t = a_1 D_{x_1} + a_2 D_{x_2} + \cdots + a_n D_{x_n} + v$$

and so the coefficients $l_n$ are expressions in various $D_{x_j}$ acting on various $a_j$. His
question now was how to estimate them and obtain the convergence result that he
wanted.

He let the variables vary by small amounts and considered the maximum effect
their variation has on the coefficients $a_1, a_2, \ldots, a_n$ and $v$. He then observed that
this effect is produced by a particularly simple partial differential equation and so a
study of this partial differential equation could be used to study Eq. (17.1). This is
the equation of the same form as Eq. (17.1) in which


= ae 
Gj(X1,X2,-.-,Xn,t, W) = Oj (X1x2...xX,tw), f=l,...,n 


in which $\alpha_1, \alpha_2, \ldots, \alpha_n$ are constants.

This is an equation of the kind that his paper of 1819 applies to, and it can be
solved by passing to a system of ordinary differential equations. These equations
have a solution for some non-zero values of the variables. This solution dominates
the conjectured power series solution (17.2), and so the latter converges and, as Cauchy
checked, it defines a solution of the original partial differential equation (17.1) in a
neighbourhood of the given system of initial values.

As Cooke ([47], 27) remarked, Cauchy did not discuss the uniqueness of the 
solution, when it exists. He seems to have assumed that the solution is determined by 
the initial values. But these papers form what is regarded as Cauchy’s contribution 
to the Cauchy–Kovalevskaya theorem, so historical generosity seems to have been
at work. 


17.4 Kovalevskaya’s Theorem and Her Counter-Example 


In 1875, the Russian mathematician Sonya Kovalevskaya (Fig. 17.1) published a
paper in the Berlin Journal für die reine und angewandte Mathematik [165] that
conveyed the results she had written the year before as a private student of
Weierstrass’s in Berlin, and would undoubtedly have led to the award of a Ph.D. at the
University of Berlin had women been eligible to study for degrees there at all. But
they were not, and so Weierstrass persuaded one of his former students, Lazarus
Fuchs, by then a professor at Göttingen, to see that she was awarded a Ph.D. there.3

In this paper, Kovalevskaya gave a new proof of Cauchy’s theorem on the existence 
of solutions to a first-order quasi-linear partial differential equation in the analytic 
case. She says that she had learned this result from Weierstrass’s lectures, and it seems 
that neither of them knew of Cauchy’s much earlier proof of the same result. Her 
ignorance is understandable, although she does mention work by Briot and Bouquet 
[23], who do cite Cauchy, but it is striking that Weierstrass could claim that he did 
not. 

She also indicated how the theorem might be extended to systems of partial
differential equations, and therefore to partial differential equations defined by a
polynomial equation in $n$ variables and the partial derivatives of the unknown function.
She showed that there is always a convergent power series in the variables about a
point $(a_1, a_2, \ldots, a_n)$ that satisfies the equation at that point, and that if such a
function satisfies the partial differential equation then the coefficients of its Taylor series
expansion can be determined from the partial differential equation. This theorem,
which was independent of Cauchy’s, is her contribution to the Cauchy–Kovalevskaya
theorem.

3German universities were admirably relaxed about where people had studied for their degree.

Fig. 17.1 Sonya Kovalevskaya (1850-1891). Acta Mathematica, Table générale des tomes 1-3,
1882-1912, p. 153

She then went on to surprise her supervisor with a novel, and indeed disturbing,
observation about initial conditions for partial differential equations. Her example
([165], 22) was all the more disturbing because it concerned the one-dimensional
heat equation, which one might have thought was well understood. The equation is

$$u_t = u_{xx},$$

and everything could be expected to be well behaved if $u(x_0, t)$ and $u_x(x_0, t)$ are given
as analytic functions of $t$. Kovalevskaya took as initial conditions the requirement
that $u(x, 0) = (x - 1)^{-1}$, and observed that the partial differential equation $u_t = u_{xx}$
is formally solved by the infinite series

$$\sum_{j=0}^{\infty} \frac{t^j}{j!} \frac{d^{2j}}{dx^{2j}} (x - 1)^{-1} = \sum_{j=0}^{\infty} \frac{(2j)!}{j!} \frac{t^j}{(x - 1)^{2j+1}},$$

which reduces to the function $f(x) = (x - 1)^{-1}$ when $t = 0$. However, the power
series solution diverges for all $t \neq 0$.

It might be objected that the boundary condition involves a function that becomes
infinite when $x = 1$. Could this be the reason that the solutions diverge? It is not,
for, as Cooke ([47], 33) observes, one can conduct a similar analysis when
$f(x) = (1 + x^2)^{-1}$. In this case, the formal solution to the partial differential equation is

$$u(x, t) = \sum_{m, n = 0}^{\infty} (-1)^{m+n} \frac{(2m + 2n)!}{(2m)!\, n!} x^{2m} t^n.$$

When $t = 0$ this reduces to the series

$$u(x, 0) = \sum_{m=0}^{\infty} (-1)^m x^{2m},$$

which is indeed the power series for $(1 + x^2)^{-1}$, a function that is never infinite (provided
$x$ is real, which the partial differential equation surely requires). However, when
$x = 0$ the series reduces to

$$u(0, t) = \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!} t^n,$$

which diverges for all $t > 0$.
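
The divergence can be seen from the ratio of successive terms of the series at x = 0, whose nth term has absolute value (2n)!/n! times t^n. The sketch below uses exact rational arithmetic; the function names and the sample value t = 1/1000 are illustrative assumptions. The ratio simplifies to 2(2n + 1)|t|, which eventually exceeds 1 however small t is, so the terms cannot tend to zero.

```python
from fractions import Fraction
from math import factorial

def term(n, t):
    """n-th term (-1)**n * (2n)! / n! * t**n of the formal series for u(0, t)."""
    return (-1) ** n * Fraction(factorial(2 * n), factorial(n)) * t**n

def ratio(n, t):
    """|term(n + 1) / term(n)|, which simplifies to 2 * (2n + 1) * |t|."""
    return abs(term(n + 1, t) / term(n, t))

t = Fraction(1, 1000)  # a small positive time, chosen for illustration
for n in (0, 10, 100, 250, 1000):
    print(n, float(ratio(n, t)))  # the ratio grows linearly in n and passes 1
```

For this t the ratio first exceeds 1 at n = 250; a smaller t only postpones the point at which the terms start to grow.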
Cooke goes on to quote a solution in the form of a Fourier integral:

$$u(x, t) = \int_0^{\infty} e^{-y - t y^2} \cos(xy)\, dy,$$

which is analytic only if $t > 0$.

That year, 1875, the French were also caught ill-informed. The young Gaston 
Darboux published a four-page paper in the Comptes Rendus for January 1875 in
which he rederived Cauchy’s result from 1835 on the solution of ordinary differential 
equations and extended it to partial differential equations, noting that Briot and 
Bouquet had given a new proof and explored the consequences of the theorem. He 
also noted that one still lacked a perfectly general theory of equations of this kind, and 
promised a subsequent paper in which he would explain the theory of characteristics, 
which he attributed to Monge. 

Darboux’s argument was not that different from Cauchy’s: first show that there 
is a formal power series that satisfies the equation and the given initial conditions; 
second show that for a certain range of the variables this series converges. 

Within the month the Italian mathematician Angelo Genocchi had written in with 
“some observations”. He admired Darboux’s talent, but he noted that Cauchy had 
written about the problem for systems of partial differential equations in a series of 
papers in the Comptes Rendus for 1842, so Cauchy deserved the credit for the first 
proof. Then in 1873 the French mathematician Puiseux had made some important 
remarks about implicit functions that an Italian mathematician called Félix Chio had 
amplified, also in the Comptes Rendus. Genocchi added that Cauchy also deserved 
credit for the theory of higher dimensional spaces “about which there is so much 
noise at present”, and that there was also the delicate point discussed by German
mathematicians of what they called “convergence in equal degree” (and we call 
uniform convergence today). 
Genocchi’s note was published, and the perpetual secretary of the Académie des
Sciences, Joseph Bertrand, took the opportunity to press for the prompt publication 
of the Oeuvres of Cauchy. The opportunity was also given to Darboux to further 
develop his method. 

All this burst of activity came as a surprise to Weierstrass, as Cooke relates. It 
seems that Weierstrass had not renewed his subscription to the Comptes Rendus on 
time, and so only got the relevant copies of it some time after they had appeared in 
early 1875. He quickly wrote to Kovalevskaya to tell her what was going on (Cooke 
[47], 35): 


So you see, my dear, that this question is one which is awaiting an answer, and I am very 
glad that my student was able to anticipate her rivals in time and at least not fall behind them 
in working out the problem. 

Darboux mentions several exceptional cases which are of special interest; I am inclined to 
think that he also has encountered the difficulties (as in the equation $\partial\varphi/\partial t = \partial^2\varphi/\partial x^2$) which gave 
you so much trouble at first and which you later overcame so successfully. 


He also sent a copy of her dissertation to Hermite, Darboux’s mentor, and it so 
impressed both men that they became staunch advocates for her later on. This surely 
contributed to the high opinion that Poincaré was to have of her work. In the course of 
his prize-winning paper on celestial mechanics that made his name internationally, 
he wrote (Poincaré [215], Sect. 3): “Mme Kovalevski has considerably simplified 
Cauchy’s demonstration and has given the theorem its definitive form”. 

It might seem that this means only that Kovalevskaya improved on Cauchy’s 
method of proof, and indicated that it can break down. In fact, twentieth-century 
mathematicians found that her theorem pointed to a number of subtle developments 
of which one is worth mentioning here precisely because it was not sufficiently 
appreciated at the time: boundary conditions matter greatly when solving partial 
differential equations. We shall return to this point in later chapters. 


17.5 Exercises 


The central points of this lecture are to establish that Cauchy opened up the study 
of general partial differential equations (of order greater than one) but only in the 
analytic case, and that by a more careful analysis Kovalevskaya was able to show 
that his method of finding a transversal hypersurface did not always work. 

It seems to me that these points would be obscured by working through mathe- 
matical examples, so none are provided, except for this one: 


1. Find examples of power series with a zero radius of convergence. 


Questions 


1. What does the reception of Cauchy’s ideas about partial differential equations tell 
us about how mathematical ideas circulated in the mid-nineteenth century? 


Chapter 18 
Green’s Functions and Dirichlet’s Principle 


18.1 Introduction 


Green introduced the functions that have come to bear his name in an attempt to 
solve problems in potential theory. Here we shall see how he used them, and how 
Dirichlet and Riemann used them to study Laplace’s equation. In this chapter, more 
than is often the case when dealing with mathematics as it was discovered, the results 
are imprecise and quite some effort from later mathematicians was needed to make 
them rigorous. 


18.2 Green’s Theorems and Green’s Functions 


Green introduced these functions in his famous Essay [128], where he claimed that 
given the value of a function V on a closed surface $\sigma$ there is a unique continuous 
extension to a function V defined on the interior which satisfies the Laplace equation 
and has no singular values inside the surface. This is, of course, the Dirichlet problem 
(before Dirichlet). 

He defended this claim as follows (I have somewhat modified his language). He 
supposed that there is a function U that is harmonic except at the point P, where 
it becomes infinite like $1/r$ near the origin, and is zero on the surface $\sigma$. Then it 
followed from a general theorem of his that 

$$V(P) = \frac{1}{4\pi}\int_{\sigma} V\,\frac{dU}{dn},$$

where the expression $\frac{dU}{dn}$ denotes the normal derivative of the function U. As he 
remarked, this shows that the value of V at P is known when its values are known 
on the surface. 

Note, however, that Green’s function U depends on a parameter that locates the 
singular point. If you can determine a Green’s function for a given region that has its 
singular point at an arbitrary point P then this formula does indeed define a harmonic 
function in the region—but it is Green’s function that varies, not the values of a single 
Green’s function. 

But why does such a function exist? Green answered this question this way ([116], 
32): 


To convince ourselves that there does exist such a function as we have supposed U to be; 
conceive the surface to be a perfect conductor put in communication with the earth, and a 
unit of positive electricity to be concentrated in the point P, then the total potential function 
arising from P and from the electricity it will induce upon the surface, will be the required 
value of U. For, in consequence of the communication established between the conducting 
surface and the earth, the total potential function at this surface must be constant, and equal to 
that of the earth itself i. e. to zero (seeing that in this state they form but one conducting body). 
Taking, therefore, this total potential function for U, we have evidently $0 = \bar{U}$, $0 = \nabla^2(U)$, 
and U = 1/r for those parts infinitely near to P. As moreover, this function has no other 
singular points within the surface, it evidently possesses all the properties assigned to U in 
the preceding proof. 


This argument is an appeal to physics pure and simple. Nonetheless, the introduction 
of a function of a particular kind that solves what came to be called 
the Dirichlet problem was to become a dominant idea in work on potential theory. 
Such functions are today called Green’s functions. Their use derives from what has 
become known as Green’s identity. 

Let us introduce the notation $\nabla U$ for a function $U = U(x, y, z)$ to mean 

$$\nabla U = (U_x, U_y, U_z),$$

and 

$$\nabla^2 U = U_{xx} + U_{yy} + U_{zz},$$

the Laplacian of U.¹ 
Green began by considering $\nabla U\cdot\nabla V$. It is a sum of three terms, and integrating 
each by parts gave him 

$$\int_{\text{vol}} \nabla U\cdot\nabla V = \int_{\text{surf}} V(n\cdot\nabla U) - \int_{\text{vol}} V\nabla^2 U,$$

where integrals $\int_{\text{vol}}$ are taken over a region D in $\mathbb{R}^3$ and integrals $\int_{\text{surf}}$ are taken over 
the surface C of the region D, and n is the outward unit normal vector at a point on 
C. 
Likewise, 

$$\int_{\text{vol}} \nabla V\cdot\nabla U = \int_{\text{surf}} U(n\cdot\nabla V) - \int_{\text{vol}} U\nabla^2 V,$$

but switching U and V does not change the integral on the left-hand side, so 

$$\int_{\text{surf}} V(n\cdot\nabla U) - \int_{\text{vol}} V\nabla^2 U = \int_{\text{surf}} U(n\cdot\nabla V) - \int_{\text{vol}} U\nabla^2 V,$$

and this rearranges to give Green’s identity: 

$$\int_{\text{vol}} U\nabla^2 V - \int_{\text{vol}} V\nabla^2 U = \int_{\text{surf}} U(n\cdot\nabla V) - \int_{\text{surf}} V(n\cdot\nabla U).$$

¹ Green wrote everything out in full. 
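
The integration by parts that leads to the first of these equations can be condensed into a single use of the divergence theorem. The lines below are a sketch in modern vector notation, which Green did not use, and they assume that U and V are twice continuously differentiable on D and up to its boundary C. 

```latex
\begin{align*}
\nabla\cdot(V\,\nabla U) &= \nabla V\cdot\nabla U + V\,\nabla^2 U
  && \text{(product rule)}\\
\int_{\text{vol}} \nabla\cdot(V\,\nabla U) &= \int_{\text{surf}} V\,(n\cdot\nabla U)
  && \text{(divergence theorem)}\\
\int_{\text{vol}} \nabla U\cdot\nabla V &= \int_{\text{surf}} V\,(n\cdot\nabla U)
  - \int_{\text{vol}} V\,\nabla^2 U
  && \text{(combine the two lines)}
\end{align*}
```

Exchanging the roles of U and V gives the second equation, and subtracting one from the other removes the symmetric term $\int_{\text{vol}} \nabla U\cdot\nabla V$. 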


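The Poisson problem asks for a function V with these properties: 

• $\nabla^2 V = F$ in D, and 
• $V = f$ on C, 

for given functions F and f. It reduces to the Dirichlet problem when $F = 0$. 
Green’s method transforms the Poisson problem into another that might be easier 
to solve. He looked for a function U such that 

• $\nabla^2 U = 0$ except at one point P in D, where it is infinite like $1/r$ (and r is the 
radial distance from $(x, y, z)$ to P), and 
• $U = 0$ on C. 

We plug these into Green’s identity and get 

$$\int_{\text{vol}} V\nabla^2 U - \int_{\text{vol}} U\nabla^2 V = \int_{\text{surf}} V(n\cdot\nabla U) - \int_{\text{surf}} U(n\cdot\nabla V).$$

We take these integrals in turn. 

• $\int_{\text{vol}} V\nabla^2 U = V(P)$, provided U is normalised at P so that no constant factor appears 
• $\int_{\text{vol}} U\nabla^2 V = \int_{\text{vol}} UF$ 
• $\int_{\text{surf}} V(n\cdot\nabla U) = \int_{\text{surf}} f(n\cdot\nabla U)$ 
• $\int_{\text{surf}} U(n\cdot\nabla V) = 0$. 

So we deduce that 

$$V(P) = \int_{\text{vol}} UF + \int_{\text{surf}} f(n\cdot\nabla U).$$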


This expresses the function V in terms of the given data F and f and the function U, 
which is known as a Green’s function. The hope is that U can be found because it 
depends only on the shape of D. For simple shapes, Green’s function can often be 
found explicitly, thus yielding a specific solution of the Dirichlet problem. Likewise, 
theorems that establish the existence of a Green’s function for a large class of domains 
(such as Harnack’s theorem, see Sect. 19.3) also solve the Dirichlet problem for those 
domains, and that can be easier to do. 
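
As an illustration of a simple shape, the unit ball has an explicit Green’s function given by the classical method of images, which is not discussed in the text above. The following Python sketch is only a numerical sanity check under that assumption: the function names, the sample singular point, and the random sampling are all choices made for this example. It evaluates the candidate $U(x) = 1/|x - P| - 1/(|P|\,|x - P^*|)$, with $P^* = P/|P|^2$, at points of the unit sphere and records how far it is from zero there. 

```python
import math
import random


def greens_function_unit_ball(x, p):
    """Image-charge candidate U(x) = 1/|x - p| - 1/(|p| |x - p*|), p* = p/|p|^2.

    Assumes 0 < |p| < 1, so the image point p* lies outside the unit ball
    and U has no singularity inside the ball other than the one at p.
    """
    norm_p = math.sqrt(sum(c * c for c in p))
    p_star = [c / norm_p**2 for c in p]
    return 1.0 / math.dist(x, p) - 1.0 / (norm_p * math.dist(x, p_star))


def random_unit_vector(rng):
    """Return a point on the unit sphere by normalising a Gaussian sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm_v = math.sqrt(sum(c * c for c in v))
        if norm_v > 1e-9:
            return [c / norm_v for c in v]


rng = random.Random(0)                 # fixed seed, only for reproducibility
singular_point = [0.3, -0.2, 0.5]      # hypothetical choice of P inside the ball
max_boundary_value = max(
    abs(greens_function_unit_ball(random_unit_vector(rng), singular_point))
    for _ in range(1000)
)
print(max_boundary_value)              # should be at rounding-error level
```

The check confirms only the boundary condition $U = 0$ on the sphere. That U is harmonic away from P follows because each of the two terms is a multiple of a reciprocal distance. 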


18.3 Dirichlet Principle and Problem 


The young William Thomson seems to have been the first to state something like 
this principle. In a paper in Liouville’s Journal for 1847, he claimed that a harmonic 
function on a given region bounded by a surface could be found for which the normal 
derivative at every point on the surface took a given value F. 

He claimed that among the functions for which $\iint V F\,dS = A$, where A is an 
arbitrary constant, there will be one for which the integral 

$$\iiint \left[\left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2 + \left(\frac{\partial V}{\partial z}\right)^2\right] dx\,dy\,dz$$

takes a minimum value, and for this function V 

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

and the normal derivative of V is a constant multiple of the given value F. The only 
proof he gave was that the result followed by the calculus of variations, which in fact 
only connects the double and triple integrals but does not guarantee the existence of 
a minimum. 

The Dirichlet problem specifies a simply connected domain T with a boundary 
$\partial T$, and a continuous function defined on $\partial T$. It then asks for a harmonic function 
defined on the interior of T that continuously extends the function defined on $\partial T$. It 
is then easy to show that the function is unique, if it exists. Dirichlet had suggested 
in lectures in 1856-1857 at Göttingen that a solution would always exist because of 
a general argument that became known as the “Dirichlet principle”. 

The Dirichlet principle, as given by Grube, states²: 


For every bounded connected domain T there are clearly infinitely many functions u continuous 
together with their first-order derivatives, for x, y, z which reduce to a given value on 
this surface. Among these functions there will be at least one which reduces the following 
integral 

$$U = \int \left\{\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2\right\} dT$$

extended over the domain T, to a minimum; it 
is evident that this integral has a minimum since it cannot become negative. We can now 
show the following: 

1. Every such function u which minimizes U, satisfies the differential equation 
$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$ everywhere in the domain T. This already makes it clear that there always 
exists a function u having the desired property, namely that function for which U becomes 
a minimum. 
2. Every function u which satisfies the [above, JJG] differential equation within the domain 
T, minimizes the integral U. 
3. The integral U can have only one minimum. It follows from 2 and 3 that there is only one 
function u with the desired property. 


² The lectures were published posthumously in an edition by Grube in 1876. 
³ From Dirichlet ([64], 127-128), quoted in Bottazzini, The Higher Calculus, 300. 


As is often remarked, the problem with these claims is that they assume that there 
is a function that minimises the integral simply on the grounds that the integral can 
never be negative. But that is to confuse the existence of a lower bound with the 
existence of a function for which the integral attains its lower bound. (Compare the 
behaviour of the function f(x) = 1/x on the positive real axis: it is bounded below 
by zero, but there is no value of x for which f(x) = 0.) 

So Thomson had shown that the Dirichlet principle is a claim in the calculus of 
variations, and indeed the Euler-Lagrange equation for the integral U leads to the 
Laplace equation. But that does not vindicate the principle—it merely locates it in 
a family of plausible but unproved claims. We shall now look briefly at the first 
attempts to prove it, and then at other attempts on the Dirichlet problem. 

We also have Dedekind’s statement of the Dirichlet principle, which is quoted in 
a paper by Weierstrass [271].⁴ There Dedekind wrote that 


Given any finite surface, one can always, and in only one way, endow it with mass so that 
the potential at any point of the surface has an arbitrary, continuously varying, value. 


This is not very clear, but Dedekind followed with a mathematical interpretation: 


As a proof, we offer the following theorem. 

Given any finite connected space t, there is always one and only one function u that, together 
with its first derivatives, is everywhere continuous in t, and on the boundary of t takes 
arbitrarily prescribed, continuously varying values, and satisfies the equation 

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

everywhere in t. 


Dedekind then noted that an exactly analogous situation holds in the subject of 
heat diffusion, where it is also intuitively evident. Then he went on: 


We prove the theorem by drawing on pure mathematical evidence. It is in fact reasonable that 
among all functions u that, together with their first derivatives, are everywhere continuous 
in t, and take arbitrarily prescribed values on the boundary of t, there must be one (or more) 
that give the integral taken over the entire space t its least value. 


Dedekind first proved that such a minimiser has the required property, and then that 
it is unique. 

⁴ Dedekind would have learned this material, as Riemann did, from Dirichlet’s lectures. 


18.4 Riemann on Green’s Theorem 


Riemann lectured on this material in the summer semester of 1861 at Göttingen. His 
lectures were, one could say, his own reworking of what Dirichlet had done, and they 
were published posthumously in an edition by his former student Karl Hattendorff 
in 1875, under the title Schwere, Elektricität und Magnetismus (Gravity, Electricity 
and Magnetism). 

In Sect. 21, Riemann showed how to construct Green’s functions to solve the 
Dirichlet problem. First, he derived Green’s theorem following Green’s own argument, 
to which he referred. Then he assumed that there is a function $U_1(x, y, z)$ that 
satisfies Laplace’s equation inside a domain T bounded by a closed surface S and 
that takes the value $-1/r$ on S, where r is the distance from a point $(x, y, z)$ to a 
fixed but arbitrary point $P' = (x', y', z')$ in T. He deferred proof of the existence of 
the function $U_1$ until Sect. 34. Then the function 

$$U = U_1 + \frac{1}{r}$$

satisfies the Laplace equation in T except at the point $P'$, where it becomes infinite 
like $1/r$, and it takes the value zero on S. 

He now surrounded the point $P'$ with a small sphere of radius c that lies entirely 
in T. He labelled the interior of this sphere $T_2$, and the complement of this region in 
T he called $T_1$. 

Inside $T_1$ the functions U and V satisfy the conditions of Green’s theorem, and 
Riemann considered the limit as $c \to 0$. The integral 

$$\int_{T_1} V\nabla^2(U)\,dx\,dy\,dz$$

is zero over all of $T_1$ and can be ignored. It remained to consider 

$$-\int_{T_1} U\nabla^2(V)\,dx\,dy\,dz.$$

The space T being closed and bounded the integral taken over $T_1$ is bounded for any 
value of c. Inside $T_2$ the volume element may be taken to be $r^2\sin\theta\,dr\,d\theta\,d\varphi$, and 
because U only becomes infinite like $1/r$ the contribution of the integral over $T_2$ to 
the whole integral remains finite as $c \to 0$. 

The integral over S, the outer boundary of $T_1$, is 

$$\int_S V\,\frac{\partial U}{\partial n}\,d\sigma,$$

where $\frac{\partial U}{\partial n}$ is the normal derivative of U. The integral over the inner boundary, the 
small sphere of radius c, reduces to $-4\pi V(P')$ as $c \to 0$. So Riemann deduced that 

$$4\pi V(P') = -\int_T U\nabla^2(V)\,dx\,dy\,dz + \int_S V\,\frac{\partial U}{\partial n}\,d\sigma.$$

Riemann next looked at what happens when the function V becomes discontinuous 
in various ways, corresponding in potential theory to the presence of mass on a 
surface, on a line, or concentrated at a point. 

He also showed that the potential function with given boundary values is unique, 
and showed how to find it explicitly for regions of various shapes. He showed that 
if $Q' = (u', v', w')$ is another point in T and $U_{P'}$ and $U_{Q'}$ are Green’s functions that 
become infinite at the points $P'$ and $Q'$, respectively, then 

$$U_{P'}(Q') = U_{Q'}(P'),$$

and remarked that this meant that U was a symmetric function of $(x', y', z')$ and 
$(u', v', w')$. 
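
The symmetry can be seen concretely in a case where the Green’s function is known in closed form. The sketch below uses the half-space $z > 0$, which is unbounded and so lies outside the closed and bounded regions Riemann considered; it is offered only as an illustration. The reflection formula is the classical image construction, and the function name and the two sample points are assumptions made for this example. 

```python
import math


def greens_function_half_space(x, p):
    """U_p(x) = 1/|x - p| - 1/|x - p_reflected| for the half-space z > 0.

    p_reflected is the mirror image of p in the plane z = 0, so U_p
    vanishes on that plane and is infinite like 1/r at p.
    """
    p_reflected = (p[0], p[1], -p[2])
    return 1.0 / math.dist(x, p) - 1.0 / math.dist(x, p_reflected)


P = (0.2, -0.4, 1.5)   # hypothetical singular point P'
Q = (1.0, 0.3, 0.7)    # hypothetical singular point Q'

# Riemann's symmetry statement: U_{P'}(Q') = U_{Q'}(P').
difference = greens_function_half_space(Q, P) - greens_function_half_space(P, Q)
print(difference)      # should be at rounding-error level
```

In this example the symmetry is visible by inspection, because the distance from Q to the mirror image of P equals the distance from P to the mirror image of Q. 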


18.5 Riemann on the Dirichlet Principle 


Riemann discussed the existence of a potential function in Sect. 34. He noted that 
Green had established it by an appeal to physics, but this left a gap that Gauss had 
filled. Gauss had argued in his ([117], Sects. 31-34) that given any closed surface S 
in three-space and a continuous function U on the surface, there is a mass distribution 
M on the surface (which may even be negative) such that the potential function V 
of this mass distribution differs from the given function U by a constant, and indeed 
that the mass distribution can be chosen so that $V = U$. In other words, there is a 
distribution of mass such that the function $V + \frac{1}{r}$ vanishes everywhere on S. 

But, said Riemann, this proof was too close to potential theory, and a purely 
analytic proof was needed. That, he said, had been provided by Dirichlet in his 
lectures, and he proceeded to describe it. The claim is that any single-valued, finite, 
continuous function v on a closed surface S in three-space can be extended in a 
unique way into the interior T so as to remain single-valued, finite, and continuous 
and satisfy the Laplace equation 

$$\nabla^2 v = 0.$$

To prove this theorem, he considered the integral over the inside of S 

$$\Omega(u) = \int_T \nabla u\cdot\nabla u\,dx\,dy\,dz,$$

where u agrees with v on S and is continuously differentiable inside S. Evidently 
there are infinitely many such functions, and Riemann denoted one by $u_1$ and any 
other by $u = u_1 + hs$, where h is an arbitrary constant and s is a function of x, y, z 
that vanishes on S and has the same properties as u inside S. 

Then, said Riemann, the integral for $\Omega(u)$ depends on the function u but is always 
positive and finite. Therefore, he went on, there is a particular function v for which 
the integral $\Omega(v)$ takes its least value. This value cannot be zero unless v is a constant, 
for then the values on S must all be the same. 
Riemann now considered the function $u = v + hs$ and deduced that 

$$\Omega(v + hs) = \Omega(v) + 2hJ + h^2\Omega(s),$$

where 

$$J = \int_T \nabla v\cdot\nabla(s)\,dx\,dy\,dz.$$

This forces $J = 0$, for otherwise one could choose h very small and of the opposite 
sign to J, and so have 

$$\Omega(v + hs) < \Omega(v),$$

contradicting the minimality of $\Omega(v)$. 
From this Riemann deduced that $J = 0$ is the necessary, and also sufficient, condition 
for $\Omega(v)$ to take its minimum value. Integration by parts then shows that this 
condition is the same as 

$$\int_T s\nabla^2(v) = 0,$$

and because s is arbitrary therefore that 

$$\nabla^2(v) = 0.$$

Riemann now deduced that the function v is continuously differentiable, and that 
it is unique. 
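
The two computations behind this argument can be written out as follows. This is a sketch in modern notation rather than Riemann’s own presentation. It takes J to be the integral $\int_T \nabla v\cdot\nabla s$, assumes that v is twice continuously differentiable so that the integration by parts is legitimate, and uses only the facts that s vanishes on S and that n is a unit normal to S. 

```latex
\begin{align*}
\Omega(v+hs) &= \int_T \nabla(v+hs)\cdot\nabla(v+hs)\,dx\,dy\,dz\\
  &= \int_T \nabla v\cdot\nabla v
     + 2h\int_T \nabla v\cdot\nabla s
     + h^2\int_T \nabla s\cdot\nabla s\\
  &= \Omega(v) + 2hJ + h^2\,\Omega(s),\\[4pt]
J = \int_T \nabla v\cdot\nabla s
  &= \int_S s\,(n\cdot\nabla v) - \int_T s\,\nabla^2 v
   = -\int_T s\,\nabla^2 v
  \qquad\text{(since } s = 0 \text{ on } S\text{)}.
\end{align*}
```

For small h the term $2hJ$ dominates $h^2\Omega(s)$, which is why a non-zero J would allow $\Omega(v+hs)$ to fall below $\Omega(v)$. 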

He then showed that there is a unique Green’s function U that also solves the 
Dirichlet problem. He considered the function 

$$U = U_1 + \frac{1}{r},$$

where r denotes the distance of the point $(x, y, z)$ from the point $P' = (x', y', z')$ 
inside S where the function U is infinite, and $U_1$ satisfies Laplace’s equation away 
from $P'$ and agrees with $-\frac{1}{r}$ on the boundary of the domain. The function $1/r$ satisfies 
Laplace’s equation away from $P'$, and so the above function is a Green’s function that vanishes on 
S and is harmonic everywhere except at the point $(x', y', z')$ in T. The uniqueness 
of a harmonic function with given boundary conditions had been proved by Dirichlet 
himself. 


18.6 Exercises 


1. Green’s functions are not easy to compute; find instructive examples on the web. 


Questions 


1. A chicken and egg question: Which is mathematically more fundamental, the 
existence of a Green’s function for a given domain or a solution of the Dirichlet 
problem for that domain? Or are they equivalent? 

2. Are there examples of problems in physics that lead to an unsolvable Dirichlet 
problem, or is the problem one for mathematicians only? 


Chapter 19 
Attempts on Laplace’s Equation 


19.1 Introduction 


The work of Green, Gauss, and Riemann showed very clearly that the study of real 
functions of two and three variables was a rich domain that would be essential in 
the study of physics (gravitation and electro-magnetism), and which (in two dimen- 
sions) was a powerful tool in the emerging subject of complex function theory. For 
physicists such as Thomson, Helmholtz, and Maxwell, nature provided the existence 
and uniqueness theorems upon which the theory rested, but for mathematicians, and 
especially those a step or more away from theoretical physics, those theorems looked 
increasingly insecure. 


19.2 Weierstrass, Prym, and Schwarz 


Weierstrass was not persuaded by the Dirichlet principle, even for planar regions. In 
a paper he read to the Royal Academy of Sciences in Berlin in 1870, but which was 
not published until the second volume of his collected works in 1895, he agreed that 
if the Dirichlet integral exists and attains its minimum then the minimising function 
is harmonic and unique. But when he turned to the existence question, he offered 
what he called a simple example to show the inadmissibility of Dirichlet’s reasoning. 


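He observed that 

$$J = \int_{-1}^{1}\left(x\,\frac{d\varphi}{dx}\right)^2 dx,$$

where $\varphi(-1) = a \neq b = \varphi(1)$, is always positive and can take any non-zero value 
however small, but cannot take the value zero unless $\frac{d\varphi}{dx}$ vanishes on the interval 
$[-1, 1]$, which is ruled out by the boundary conditions. 
Weierstrass’s example was the function 

$$\varphi(x) = \frac{a+b}{2} + \frac{b-a}{2}\,\frac{\arctan(x/\varepsilon)}{\arctan(1/\varepsilon)},$$

where $\varepsilon$ is arbitrarily small and positive.¹ The graph of this function with $\varepsilon = 0.01$, 
$a = -1$, $b = 1$, in Fig. 19.1 suggests that this function is likely to do the trick. 

Fig. 19.1 The graph of $\arctan(x/0.01)/\arctan(1/0.01)$, $-1 \le x \le 1$ 

Its derivative is 

$$\frac{d\varphi}{dx} = \frac{b-a}{2\arctan(1/\varepsilon)}\,\frac{\varepsilon}{x^2+\varepsilon^2},$$

so the integral is 

$$J = \left(\frac{b-a}{2\arctan(1/\varepsilon)}\right)^2 \int_{-1}^{1}\left(\frac{x\varepsilon}{x^2+\varepsilon^2}\right)^2 dx.$$

The integrand is always less than $\left(\frac{x}{\varepsilon}\right)^2$ and $\left(\frac{\varepsilon}{x}\right)^2$, and takes its maximum value of $\frac{1}{4}$ 
at $x = \pm\varepsilon$. 

Weierstrass now argued that the integrand is positive and less than $\varepsilon^2/(x^2 + \varepsilon^2)$, 
so the integral is less than 

$$\left(\frac{b-a}{2\arctan(1/\varepsilon)}\right)^2 \int_{-1}^{1}\frac{\varepsilon^2}{x^2+\varepsilon^2}\,dx = \frac{\varepsilon\,(b-a)^2}{2\arctan(1/\varepsilon)}.$$

So the integral can be arbitrarily small for a suitable function $\varphi(x)$ which has a 
continuous first derivative, but can never be zero. So, he concluded, 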


The Dirichlet principle leads in this case to a false result. 


¹ It helps to recall that $\frac{d}{dx}\arctan(x/\varepsilon) = \varepsilon/(\varepsilon^2 + x^2)$. 

It is curious that Weierstrass did not use the simpler straight line version, given 
by 

$$\varphi(x) = \begin{cases} 1 & \text{if } 1/n \le x \le 1,\\ nx & \text{if } -1/n \le x \le 1/n,\\ -1 & \text{if } -1 \le x \le -1/n. \end{cases}$$

In this case, 

$$J = \int_{-1}^{1}\left(x\,\frac{d\varphi}{dx}\right)^2 dx = \int_{-1/n}^{1/n} n^2x^2\,dx = \frac{2}{3n},$$

which becomes arbitrarily small as $n \to \infty$. It is unlikely he did not think of it. More 
likely, he rejected it because the theory of admissible curves that were not smooth 
was much under discussion at the time and he did not want to weaken his critique of 
the Dirichlet principle with extraneous considerations of that kind. 
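
Both families of functions can be checked numerically. The Python sketch below is an illustration, not part of Weierstrass’s argument. It assumes the integral is $J = \int_{-1}^{1} (x\,d\varphi/dx)^2\,dx$ with $a = -1$ and $b = 1$, approximates it by the midpoint rule for the arctangent family, and compares the result with the bound $\varepsilon(b-a)^2/(2\arctan(1/\varepsilon))$. The function names, the step count, and the sample values of $\varepsilon$ are choices made for this example. 

```python
import math


def weierstrass_J(eps, a=-1.0, b=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of (x * phi'(x))**2 over [-1, 1],

    where phi is the arctangent function with parameter eps and
    phi'(x) = (b - a) / (2 * arctan(1/eps)) * eps / (x**2 + eps**2).
    """
    scale = (b - a) / (2.0 * math.atan(1.0 / eps))
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        dphi = scale * eps / (x * x + eps * eps)
        total += (x * dphi) ** 2
    return total * h


def weierstrass_bound(eps, a=-1.0, b=1.0):
    """Upper bound eps * (b - a)**2 / (2 * arctan(1/eps)) for J."""
    return eps * (b - a) ** 2 / (2.0 * math.atan(1.0 / eps))


def piecewise_linear_J(n):
    """Value 2 / (3n) of J for the straight line version with slope n."""
    return 2.0 / (3.0 * n)


for eps in (0.5, 0.1, 0.02):
    # J stays positive, lies below the bound, and shrinks with eps.
    print(eps, weierstrass_J(eps), weierstrass_bound(eps))

for n in (2, 10, 50):
    print(n, piecewise_linear_J(n))
```

In both families the values decrease towards zero without reaching it, which is the point of the example: the lower bound of J is zero, but no admissible function attains it. 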

Weierstrass’s argument destroys the belief that any integral that is bounded below 
attains its bounds. But it leaves open the possibility that the Dirichlet integral may 
do so, and that the Dirichlet principle leads to a solution of the Dirichlet problem. 
However, a former student of Riemann’s, Friedrich Prym, showed in his [228] that 
Riemann’s use of the Dirichlet principle can fail, even when Dirichlet’s problem can 
be solved. 

Prym did so by exploiting an idea of Riemann’s hitherto ignored in studies of the 
problem, namely, that a continuous function may oscillate wildly. Such functions may 
depart from their Fourier series representations, unlike the ones studied by Dirichlet 
that had only finitely many maxima and minima. 

Prym considered (in Sect. 8) a disc in the plane of radius $R < \frac{1}{2}$ centred on the 
origin, and took polar coordinates $\rho$ and $\tau$ centred on the point $(-R, 0)$, so $\rho$ takes 
every value from 0 to $2R < 1$ and $\tau$ every value from $-\frac{\pi}{2}$ to $\frac{\pi}{2}$. On this disc he defined 
the function u as the real part of the complex function 

$$u + iv = i\sqrt{-\ln(R + x + iy)}$$

that was taken to satisfy $-\ln(R + x + iy) = -\ln\rho - i\tau$. He now showed that this 
function is defined and continuous at the origin of the polar coordinates. 

The Dirichlet problem is solved, because the functions u and v are everywhere 
defined and single valued, even on the boundary of the disc, and the function u is 
harmonic because it is the real part of a complex analytic function. However, as Prym then 
showed, Dirichlet’s integral L (u, 0) is infinite, because the function u oscillates 
infinitely often in any neighbourhood of the point $\rho = 0$. 

Prym’s contribution left open the question of whether the Dirichlet problem could 
be solved. The leading figure here was Hermann Amandus Schwarz (Fig. 19.2), and 
in a series of papers around 1870 he was able to solve the problem in a fair degree 
of generality.² 

² Another way forward was an iterative process described by Carl Neumann, whose work will not 
be discussed here. 

Fig. 19.2 Hermann Amandus Schwarz (1843-1921) 

In his paper [244], Schwarz first solved the problem when the domain T is the unit 
disc. He considered an arbitrary function f on the unit circle that is finite, continuous, 
and real-valued everywhere (so it corresponds to a periodic function defined on $\mathbb{R}$). 
He then wrote down the function $u(r, \varphi)$ that solves the Dirichlet problem. It is 
defined by the following equations: $u(1, \varphi) = f(\varphi)$ and 

$$u(r, \varphi) = \frac{1}{2\pi}\int_0^{2\pi} f(\psi)\,\frac{1 - r^2}{1 - 2r\cos(\psi - \varphi) + r^2}\,d\psi, \qquad 0 \le r < 1.$$

A careful analysis shows that it is finite and continuous throughout the open unit disc 
$0 \le r < 1$, and converges as $r \to 1$ to a function that is finite and continuous on the whole 
closed unit disc. Once this is established, straightforward differentiation shows that 
the function $u(r, \varphi)$ satisfies the differential equation $\Delta u = 0$ in the interior of the 
disc. 
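
The integral formula can be tried out numerically. The Python sketch below assumes the kernel $(1 - r^2)/(1 - 2r\cos(\psi - \varphi) + r^2)$ with the factor $1/2\pi$, and approximates the integral by an equally spaced sum over one period. The function name, the sample count, and the boundary function $f(\psi) = \cos\psi$ are choices made for this example. For that f the harmonic extension is known to be $r\cos\varphi$, the real part of z, which gives an independent check. 

```python
import math


def poisson_integral(f, r, phi, samples=2000):
    """Approximate u(r, phi) for 0 <= r < 1 by an equally spaced sum.

    u(r, phi) = (1 / 2 pi) * integral over [0, 2 pi] of
                f(psi) * (1 - r^2) / (1 - 2 r cos(psi - phi) + r^2) d psi.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    h = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        psi = k * h
        kernel = (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(psi - phi) + r * r)
        total += f(psi) * kernel
    return total * h / (2.0 * math.pi)


# Boundary values cos(psi): the harmonic extension should be r * cos(phi).
u_value = poisson_integral(math.cos, 0.5, 0.3)
exact_value = 0.5 * math.cos(0.3)
print(u_value, exact_value)
```

The sum becomes harder to evaluate accurately as r approaches 1, because the kernel concentrates near $\psi = \varphi$. This reflects the careful analysis needed at the boundary. 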

It follows that the Dirichlet problem is solved for any domain conformally equiv- 
alent to the unit disc, but Riemann’s claim that this was true of any simply connected 
domain was far from understood or accepted (or, it should be said, precise). Indeed, 
one of Schwarz’s earliest papers [242] had been to establish the equivalence precisely 
for a disc and a square, thus giving a new twist to the famous problem of squaring the 
circle.³ (This is an early example of what was to become the Schwarz–Christoffel 
theorem at work.) So Schwarz next presented a method of extending the solution 
from domains where the problem was solved to domains formed by overlapping 
such domains in the plane. This gave him a large class of domains for which the 
Dirichlet problem had a solution. 


³ For his proof that a square can be mapped analytically onto a circle, see a translation of his paper 
below, in Sect. 31.3. 


19.2.1 Schwarz’s Alternating Method 


Schwarz in his [243] considered two domains, $T_1$ and $T_2$, that overlap in such a way 
that $T^* = T_1 \cap T_2$, the region common to both of them, is two-dimensional and the 
boundaries of the regions cross with distinct tangents. 

He supposed that the Dirichlet problem on $T_1$ can be solved for any prescribed 
(finite, continuous, real-valued) function $g_1$ on the boundary of $T_1$, and likewise that 
it can be solved on $T_2$ for any function $g_2$ on the boundary of $T_2$. He then showed 
that there is a solution to the Dirichlet problem for the domain $T_1 \cup T_2$ with any 
prescribed function u on the boundary $L_0 \cup L_3$. This function will be bounded below, 
and so without loss of generality Schwarz assumed that $u > 0$. 

He divided the boundary of $T_1$ into two parts, one outside $T_2$ that he called $L_0$, 
and one inside $T_2$ that he called $L_2$. Similarly, he called $L_3$ the boundary of $T_2$ that 
lies outside $T_1$ and $L_1$ the part of the boundary of $T_2$ that lies inside $T_1$. The region 
$T_1 \cup T_2 \setminus T^*$ he called T. (For a picture, see Fig. 31.1.) 

Schwarz’s idea was that an air pump (his term) could be imagined that pumped 
air from the region $T^*$ alternately into the regions $T_1 \setminus T^*$ and $T_2 \setminus T^*$, through 
the membranes $L_1$ and $L_2$. In mathematical terms, he supposed that the Dirichlet 
problem is first solved on $T_1$ with the boundary values $u_1 = u$ on $L_0$ and a constant, 
k, on $L_2$ (k will later be chosen to be the minimum value of the function u on the 
boundary $L_0 \cup L_3$). The solution is a harmonic function $u_1$, say, inside $T_1$. He then 
chose the solution $u_2$ of the Dirichlet problem for the region $T_2$ for a function which 
took the same values as the function u on $L_3$ and the values of $u_1$ on $L_1$. He then 
treated the region $T_1$ as he had just treated $T_2$ to obtain a harmonic function $u_3$, then 
turned to $T_2$ again to obtain a harmonic function $u_4$, and so on. 

Why does this help? The maximum and minimum values of a harmonic function 
are taken on the boundary of its domain. Let the maximum and minimum values of the 
function u on $L_0 \cup L_3$ be, say, g and k, respectively, and define $G := g - k$. 
Then on $L_2$ the maximum value of the function $u_2 - u_1$ is less than $g - k$. 

Next, the function $u_3 - u_1$ solves the Dirichlet problem where the boundary function 
is 0 on $L_0$ and $u_2 - k$ on $L_2$. So $u_3 - u_1$ is never negative inside $T_1$, its maximum 
value is less than G, and its maximum value on $L_1$ is less than $Gq_1$, where $q_1 < 1$. 
A simple scaling argument shows that $q_1$ depends linearly on the maximum value of 
the boundary function on $L_3$. 

Similarly, the maximum absolute value of $u_4 - u_2$ on $L_1$ is less than $Gq_1$ and on 
$L_2$ is less than $Gq_1q_2$, where $q_2 < 1$. 

Continuing in this way, Schwarz obtained two sequences of functions, $\{u_{2i-1}\}$ 
defined on $T_1$ and $\{u_{2i}\}$ defined on $T_2$, with the properties that along $L_1$, $u_{2i-1} = u_{2i}$ 
and along $L_2$, $u_{2i+1} = u_{2i}$. He then defined two new functions 

$$u' = u_1 + (u_3 - u_1) + (u_5 - u_3) + \dots + (u_{2i+1} - u_{2i-1}) + \dots$$

$$u'' = u_2 + (u_4 - u_2) + (u_6 - u_4) + \dots + (u_{2i+2} - u_{2i}) + \dots$$

These series converge unconditionally because successive terms diminish faster than 
a geometric progression with ratio $q_1q_2$. (The geometric progression arises from the 
linearity observation made above.) The function $u'$ is harmonic inside $T_1$, and the 
function $u''$ is harmonic inside $T_2$, and they agree on the entire boundary of $T^*$, which 
is $L_1$ and $L_2$. Therefore, the functions $u'$ and $u''$ agree on $T^*$, and so they define a 
harmonic function on the whole of $T_1 \cup T_2$ that has the prescribed boundary values 
of the function u on the boundary $L_0 \cup L_3$. So the Dirichlet problem is solved for 
the union of the two regions, which is what Schwarz set out to do. 
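
The mechanism of the alternating method can be watched in a one-dimensional toy analogue, which is emphatically not Schwarz’s own setting. In the sketch below the two domains are the overlapping intervals $[0, 0.6]$ and $[0.4, 1]$, the role of a harmonic function is played by a linear function (the solutions of $u'' = 0$), and the two outer boundary points 0 and 1 stand in for $L_0$ and $L_3$. The interval end points, the function names, and the boundary values are all assumptions made for this illustration. 

```python
def solve_dirichlet_1d(left, right, value_left, value_right):
    """Return the linear function on [left, right] with the given end values.

    In one dimension this is the solution of the Dirichlet problem
    for the equation u'' = 0.
    """
    def u(x):
        t = (x - left) / (right - left)
        return value_left + t * (value_right - value_left)
    return u


def schwarz_alternating_1d(u_at_0, u_at_1,
                           overlap_left=0.4, overlap_right=0.6, sweeps=60):
    """Alternate between T1 = [0, overlap_right] and T2 = [overlap_left, 1]."""
    # Start with the constant k, the minimum of the prescribed boundary
    # values, on the inner end point of T1 (the analogue of L2).
    inner_value_t1 = min(u_at_0, u_at_1)
    for _ in range(sweeps):
        # Solve on T1 with the current value on its inner end point.
        u_odd = solve_dirichlet_1d(0.0, overlap_right, u_at_0, inner_value_t1)
        # Solve on T2, taking the value of the T1 solution on the analogue of L1.
        u_even = solve_dirichlet_1d(overlap_left, 1.0, u_odd(overlap_left), u_at_1)
        # Pass the T2 solution back to the inner end point of T1.
        inner_value_t1 = u_even(overlap_right)
    return u_odd, u_even


u_prime, u_double_prime = schwarz_alternating_1d(2.0, 5.0)
# The exact solution on [0, 1] is the straight line 2 + 3x.
print(u_prime(0.3), u_double_prime(0.8), u_prime(0.5), u_double_prime(0.5))
```

With these intervals each full sweep reduces the error on the inner end points by the factor $(2/3)\cdot(2/3) = 4/9$, a one-dimensional counterpart of the geometric ratio $q_1q_2$. 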

The alternating method provides a solution to the Dirichlet problem for a large 
class of regions, including all plane domains with polygonal boundaries with finitely 
many sides. In general, the boundary can be made up of piecewise analytic arcs 
crossing transversally, but it is not clear what can happen in the limit, so the case 
of arbitrary boundaries, even rectifiable ones, was left unresolved. That said, Con- 
stantin Carathéodory praised Schwarz in Schwarz’s Festschrift volume ([30], 20) 
for separating out the interior part and the boundary part of the Riemann mapping 
theorem, and indeed Poincaré’s method of sweeping out (1890b) similarly made 
certain simplifying assumptions about the boundary but left the extent of the method 
unresolved. 

On the other hand, as Archibald ([1], 83) points out, Schwarz’s paper required 
familiarity with Weierstrassian methods to understand, and such knowledge was only 
available to those with access to copies of Weierstrass’s lectures and notes taken by 
the few students capable of doing so. Archibald, quoting ([65], 154), records the 
astute opinion of Gösta Mittag-Leffler, who was a significant figure in spreading the 
Weierstrassian model of analysis: 

The Germans themselves are not in general sufficiently familiar with Monsieur Weierstrass’s 
ideas to be able to grasp without difficulty an exposition made strictly on the classical 
model that the great geometer has given. Take, for example, Monsieur Fuchs ... he regards 
[Weierstrass’s] methods as thoroughly superior to the method of Riemann. And yet he always 
writes in the manner of Riemann. All this evil derives from the fact that M. Weierstrass has 
not published his courses. It is true that the Weierstrassian method is taught in several German 
universities, but everyone is not yet a pupil of Weierstrass or a pupil of one of his pupils. 


19.3 Harnack


The study of the Dirichlet problem for general two-dimensional domains was much 
advanced by Axel Harnack in his book [138]. He began by reviewing the theories 
of Schwarz and Neumann, and observed that these authors had not fully studied the 
nature of the boundary before admitting, however, that he had not been able to extend 
their methods. Therefore, he had adopted a different approach using Green’s func- 
tions. He established existence theorems for functions with prescribed singularities, 
derived the general theorems in Riemann’s paper on Abelian functions, and showed 
how his ideas led to a proof of the Riemann mapping theorem. 


4 Harnack restricted his attention to this case because of the availability of conformal mappings.

Harnack’s book was well received, and subsequent mathematicians often made
use of what became known as Harnack’s theorem ([138], 67). Harnack considered
a sequence of harmonic functions $u_n$ that are defined on a surface F and restrict to
continuous functions $U_n$ on the boundary of F. He furthermore supposed that for
every arbitrarily small $\delta$ and for every point s of the boundary there is a finite domain
partly bounded by a piece of the boundary of F containing s and that contains interior
points of F and is such that the values each function $u_n$ takes on this domain (including
its boundary) vary by less than $\delta$. As he remarked, it is a necessary condition that
the functions $U_n$ be continuous. Harnack first established the lemma that if the sum
$U := \sum U_n$ converges uniformly, then the sum $u := \sum u_n$ converges at every interior
point of the surface F to a harmonic function. From this he deduced Harnack’s
theorem:


Theorem 19.3 If a sequence of harmonic functions $u_n$ all have the same sign (say,
positive) and the sum $u := \sum u_n$ converges at an interior point of the surface F, then
it converges at every interior point of the surface F to a harmonic function.


Alternatively, if a sequence of harmonic functions $u_n$ tend from below to the values
of a function u, then u is a harmonic function.

Informally, the sequence of harmonic functions either converges to a harmonic 
function or it fails to converge at all. 

Harnack’s book contains a number of advances in what later became point-set 
topology. He defined a domain to be connected if any two points in it can be joined 
by a finite polygonal arc that can be covered by overlapping discs all lying in the 
interior of the domain. Later authors refined this idea and separated the idea of 
connectedness (here, path-connectedness) from that of a domain (a region for which 
every point has a disc-like neighbourhood lying entirely in the region). Harnack also 
defined boundary points to be those points every neighbourhood of which contains 
some points belonging to the domain and some that do not. He then claimed that a 
simply connected domain has a continuous boundary, although the boundary may
have corners and cusps, may be nowhere differentiable, and, implicitly, need not even
be rectifiable.⁵ A boundary might also have an arbitrary number of incisions, lines drawn
inwards from boundary points, which would be traversed twice by any circuit of the 
boundary (Fig. 19.3). 

It was with this idea of a domain and its boundary in place that Harnack then proved 
that there is a Green’s function for every bounded region with an arbitrary boundary. 
He argued in three stages. First, he accepted Neumann’s approach establishing the 
existence of a unique harmonic function that agrees with a given continuous function 
on the boundary for polygonal regions with no re-entrant angles. Then he showed 
that if the given function on the boundary is always finite but has isolated jump 
discontinuities then there is still a harmonic function agreeing with the given one at 
points on the boundary where the given function is continuous. Finally, he used an 
approximative argument, which he attributed ultimately to Schwarz, to deal with the 


5 Riemann in his [234] had defined a domain as simply connected if any curve in it joining two
boundary points divides the domain into two pieces. 




Fig. 19.3 A circuit of the disc with the incision AB traverses AB twice: once going up and once coming down


general simply connected domain. He considered that the domain can be steadily 
approximated by polygonal regions. He used his theorem from p. 67 to establish 
that the sequence of harmonic functions on these domains converged to a harmonic 
function on the given domain. Arbitrary bounded domains were then patched together 
out of simply connected pieces. To prove the Riemann mapping theorem Harnack 
used his Green’s function approach to establish the existence of a suitable harmonic 
function and consequently of a complex function mapping the given bounded domain 
onto a circle. He also showed in this way that non-simply connected domains can 
be mapped onto a domain bounded by several circles (a problem Riemann had also 
investigated in work that was still unpublished). 

In the event, it was to turn out that the Dirichlet problem can be solved for a 
very large class of boundaries of a two-dimensional, disc-shaped region, but that the 
problem in three dimensions can only be solved for a restricted class of boundaries 
(without spikes, for example). For that reason, and because of the strong connection 
to complex function theory, I have kept the story that follows to two dimensions, and 
even then the full history of potential theory in the period is too rich to describe here. 
For a look at some of the major issues that were raised, and how some of them were 
solved, see Appendix D. 


19.4 Exercises 


1. Find some domains homeomorphic to a disc whose boundaries are not homeo- 
morphic to circles. 


Questions 


1. Try to follow through the first few stages of Schwarz’s alternating method when 
the initial values on $L_0$ are 1 and the initial values on $L_3$ are 3, noting the values
assigned at each stage to $L_1$ and $L_2$. Does it strike you as likely that the method
will lead to good approximations to the sought-for harmonic function? 


Chapter 20
Applied Wave Equations


20.1 Introduction 


The wave equation is one of the most useful in physics. Here we look at the dra- 
matic story of the trans-Atlantic cable and the later introduction of the telegraphist’s 
equation. 

The best resource for the history and context of the trans-Atlantic cable up to the 
present day is surely the History of the Atlantic Cable & Undersea Communication 
at atlantic-cable.com. See, among other things, Bern Dibner’s The Atlantic Cable 
(1959). When I gave this course in 2017 it seemed to me that everything was here 
except an explanation of the mathematics. Since then I am pleased to say that Liam 
Morris, a student on the course that year, has posted an account of the mathematics. 


20.2 The Trans-Atlantic Cable


The wave equation was at the mathematical heart of a dramatic nineteenth-century 
story: the struggle to connect Britain and America by a trans-Atlantic cable. 

The first working electric telegraphs were produced in the 1830s, and soon a 
network of cables crossed Europe and spread throughout the eastern seaboard of the 
United States. Information could now be sent reliably, and very much faster than by 
a man on a string of horses, and typically it came transcribed letter by letter into a
stream of short and long pulses—such as the dots and dashes of Morse Code. 

However, it was not so easy to connect Britain to Continental Europe. It was 
discovered that placing a cable under water increased its capacitance (the ability of a 
body to store electric charge). The increased capacitance caused the signal to spread 
out, so that the gap between one item and the next had to be increased, causing the 
time for a message to be transmitted to increase. The English Channel is not very
wide; at its narrowest it is only some 22 miles. It was much riskier to run a cable
across the Atlantic, but it would surely be extremely valuable and therefore attempts
had to be made.

To understand the problem, it is necessary to consider what is called the telegraphist’s
equation. This equation describes the current u at any point x in a straight
wire at any time t during the transmission of electric signals down the wire. It is

$$KL\,\frac{\partial^2 u}{\partial t^2} + (KR + LS)\,\frac{\partial u}{\partial t} + RSu = \frac{\partial^2 u}{\partial x^2}, \qquad (20.1)$$


where the constants that appear in the equation involve the capacitance K, the self- 
inductance L, the resistance R, and the leakage S of the wire. (Self-inductance is the 
induction of a voltage in a current-carrying wire as the current changes.) 

It was written down for the first time by the German physicist Gustav Kirchhoff in 
1857, and profoundly studied by the brilliant but eccentric English physicist Oliver 
Heaviside in 1876, but prior to them William Thomson had cleverly exploited a 
simpler equation to design a cable that would work across the Atlantic.¹

The equation Thomson derived for the flow of electricity down a wire was the 
heat equation. Although he did this on physical grounds, it was nonetheless the case 
that this equation and Fourier’s study of it were at the core of his thinking. This 
was not Eq. (20.1) later derived by Kirchhoff, but it can be obtained from it when 
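the inductance L is negligible by comparison with the resistance R, so the constant
KL may be taken to be zero, and the leakage S is also neglected; the equation then
becomes the one-dimensional heat equation,

$$KR\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$$
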
Thomson, however, was not aware of self-inductance so his methods take no account 
of it. 

It is worth noting that mathematicians often seek to understand a complicated 
equation such as this one by simplifying it. For example, in the case at hand, if one 
assumes there is no resistance (R = 0) and no leakage (S = 0) then the equation
reduces to the wave equation:


$$KL\,\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}.$$


Thomson’s simpler equation implies that an instantaneous pulse sent down a wire 
of length x lasts for a time T proportional to $x^2$ seconds (see below), and so two
separate pulses must be transmitted T seconds apart in order to be received as distinct 
signals at the far end. But in 1855 Thomson’s advice was ignored. 

Attempts to lay the cable were dogged by failure. The first cable, laid in 1857, 
snapped after 338 miles. A second cable, laid in 1858, succeeded, and a 99-word 
message was sent from Queen Victoria to President Buchanan to mark the event, 


1 See Thompson [256], Kirchhoff [155], Heaviside [140], and Rayleigh’s Theory of Sound, Vol. 1,
p. 466.

Fig. 20.1 William Thomson, later Lord Kelvin (1824-1907), Memorials of the Old College of Glasgow, Glasgow, 1871


but this only revealed a greater failure: the message took 16.5 hours to transmit.
Misguided attempts to improve performance made things worse, and after a month
the cable had to be abandoned. As the mathematician Thomas Körner put it, “2500
tons of cable and £350,000 of capital lay useless on the ocean floor”.²

Thomson (Fig. 20.1) had opposed the original design, and was now placed in a 
position to insist on his insights being implemented. Now, “Half a million pounds 
was being staked on the correctness of the solution to a partial differential equation”.³

As before, the first attempt at laying the cable, which was much heavier than the 
earlier one, failed when the cable broke, and another one had to be laid. But they were 
able to recover the ends of the broken cable and reconnect it, and on 8 September 
1866 America and Europe were joined by two cables that, moreover, worked as 
planned. Signals could be transmitted at roughly eight words a minute, a decisive 
improvement on the earlier, and by then defunct, cable. Thomson was knighted and 
further rewarded with a considerable amount of money, some of which he used to buy
an ocean-going yacht—he was a keen sailor. 

Thomson’s telegraphist’s equation is the heat equation. There is only space here 
to describe his solution, not to prove it.⁴

The problem is that of the distribution of heat in a semi-infinite one-dimensional 
rod x > 0, which may not make much sense as a problem about heat but makes good 
sense if we think of the rod as a wire. 

Define the functions $f_-(w) = \exp\left(-\frac{(x-w)^2}{4Kt}\right)$ and $f_+(w) = \exp\left(-\frac{(x+w)^2}{4Kt}\right)$. Then
the function

$$\theta(x,t) = \frac{\theta_0}{2\sqrt{\pi K t}} \int_0^\infty \left(f_-(w) - f_+(w)\right) dw$$

is a solution of the heat equation

$$\frac{\partial \theta}{\partial t}(x,t) = K\,\frac{\partial^2 \theta}{\partial x^2}(x,t)$$

for which $\theta(x,t) \to \theta_0$ as $t \to 0$ for all x > 0 and $\theta(0,t) = 0$ for all t > 0.


2 See Körner ([164], 334).
3 See Körner ([164], 336).
4 For the mathematical details, see Körner ([164], Chap. 62).

Fig. 20.2 The graph of P(x, t), 0 < x < 5, 0 < t < 0.08


If instead it is required that $\theta(x,t) \to f(t)$ as $x \to 0+$ for all t > 0, then the
solution becomes


$$\theta(x,t) = \frac{x}{2\pi^{1/2}K^{3/2}} \int_0^t f(s)\,(t-s)^{-3/2} \exp\left(-\frac{x^2}{K(t-s)}\right) ds.$$


So if the question becomes what is the result of briskly heating one end by a
function f(t) that is zero outside of a small interval (say f(t) = 0 for t > a), then the
above expression is the answer.

In these circumstances, the solution is well approximated by 


$$\theta(x,t) = \int_0^a f(s)\,P(x,t)\,ds,$$


where

$$P(x,t) = \frac{x}{2\pi^{1/2}(Kt)^{3/2}} \exp\left(-\frac{x^2}{Kt}\right).$$


So if the input is an initial, short blip—a pulse concentrated in a very short interval— 
the output at a point x at time t is given by the graph in Fig. 20.2. Slices for constant
t show the shape of the pulse at that time; slices for constant x show what happens 
at that point as time goes by. 

If $t = t_0$, a constant, then $P(x, t_0)$ is a function of x, and

$$\frac{\partial}{\partial x}P(x,t_0) = \frac{1}{2\pi^{1/2}(Kt_0)^{3/2}} \exp\left(-\frac{x^2}{Kt_0}\right)\left(1 - \frac{2x^2}{Kt_0}\right).$$

The maximum value occurs at

$$x = \sqrt{\frac{Kt_0}{2}} \quad\text{and is}\quad \frac{\sqrt{2}\,e^{-1/2}}{4\pi^{1/2}Kt_0}.$$

So when $t_0$ is small the maximum is quite large and occurs for a small value of x,
but when $t_0$ is large the maximum is quite small and occurs for a large value of x.
This confirms what the graph suggests, that the signal travels down the wire getting 
weaker as it goes. 
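If $x = x_0$ is a constant then $P(x_0, t)$ is a function of t, and

$$\frac{\partial}{\partial t}P(x_0,t) = \frac{x_0}{4\pi^{1/2}(Kt)^{3/2}\,t} \exp\left(-\frac{x_0^2}{Kt}\right)\left(\frac{2x_0^2}{Kt} - 3\right).$$

The maximum value occurs at

$$t = \frac{2}{3K}\,x_0^2 \quad\text{and is}\quad \left(\frac{3^{3/2}e^{-3/2}}{2^{5/2}\pi^{1/2}}\right)\frac{1}{x_0^2}.$$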


So when $x_0$ is small the maximum is reached very quickly and is quite large, but
when $x_0$ is large the maximum is quite small and occurs for a large value of t. This
again confirms what the graph suggests, that the signal travels down the wire getting 
weaker as it goes. 

We see that the output is a pulse that rises very quickly from zero to a maximum 
and then steadily and slowly declines to zero. As a result, the solution of the heat 
equation is immediately non-zero everywhere, rises to a maximum that is inversely
proportional to $x^2$ after a time proportional to $x^2$, and then declines.
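
These claims are easy to check numerically. The following is a minimal sketch, assuming the normalisation of P used in the formulas above (with exponent $-x^2/(Kt)$); the values of K, $t_0$ and $x_0$ and the helper names are hypothetical and chosen only for illustration.

```python
import math


def pulse(x, t, K=1.0):
    """P(x, t) = x / (2 sqrt(pi) (K t)^(3/2)) * exp(-x^2 / (K t))."""
    return x / (2.0 * math.sqrt(math.pi) * (K * t) ** 1.5) * math.exp(-x * x / (K * t))


def grid_argmax(f, lo, hi, n=20001):
    """Location of the largest value of f on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (n - 1)
    best = max(range(n), key=lambda i: f(lo + i * step))
    return lo + best * step


K = 2.0             # hypothetical constant, for illustration only
t0, x0 = 0.5, 1.5   # hypothetical fixed time and fixed position

x_peak = grid_argmax(lambda x: pulse(x, t0, K), 1e-3, 5.0)
t_peak = grid_argmax(lambda t: pulse(x0, t, K), 1e-3, 5.0)
t_peak_doubled = grid_argmax(lambda t: pulse(2 * x0, t, K), 1e-3, 10.0)

print(x_peak, math.sqrt(K * t0 / 2))   # numerical and closed-form peak position
print(t_peak, 2 * x0 ** 2 / (3 * K))   # numerical and closed-form peak time
print(t_peak_doubled / t_peak)         # ratio of peak times when the distance is doubled
```

Doubling the distance should roughly quadruple the time at which the peak arrives, which is the $x^2$ law that made a long cable so much slower than a short one.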

We can also see that when $x_0$ is large, the pulse broadens as it travels by an
amount proportional to the square of the distance it has travelled. For
in $\frac{\partial}{\partial t}P(x_0, t)$ the first term in t is $t^{-5/2}$ and so very nearly zero, and in particular
almost a constant, and the exponential term is likewise very nearly 1 and therefore
also essentially a constant. Therefore, the variation in $P(x_0, t)$ around its maximum
is determined by the final term in t, which is quadratic in $x_0$.

Moreover, all these values depend on the value of K, so that is where the physics 
comes in: finding the materials that give the best value for the shape of the pulse. In 
particular, K should be as small as possible to keep the pulse sharp. 


20.3 Poincaré’s Solution 


The full telegraphist’s equation was solved for the first time by Poincaré in 1893.⁵
In his short paper [217], he took the telegraphist’s equation in the form

$$A\,\frac{\partial^2 V}{\partial t^2} + 2B\,\frac{\partial V}{\partial t} = C\,\frac{\partial^2 V}{\partial x^2}.$$


5 Poincaré had been fascinated by telegraphy as a boy and was eager to explain how it worked to
anyone, especially family members. He continued this interest throughout his life,
writing on wireless telegraphy too.


He then supposed that the physical units were so chosen that this equation becomes 


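$$\frac{\partial^2 V}{\partial t^2} + 2\,\frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2},$$


and the velocity of the signal is that of light (and is 1 in these units). He then set
$V = Ue^{-t}$ and reduced the equation (as Fourier had done in his study of heat) to

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial^2 U}{\partial x^2} + U.$$

In so doing, he assumed that $B^2 - 4AC$ is non-zero.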
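
The reduction can be checked in a few lines. In the sketch below, subscripts denote partial derivatives, a shorthand adopted here for brevity rather than Poincaré's own notation.

```latex
\begin{align*}
V      &= U e^{-t}, \\
V_t    &= (U_t - U)\, e^{-t}, \\
V_{tt} &= (U_{tt} - 2U_t + U)\, e^{-t}, \\
V_{tt} + 2V_t &= (U_{tt} - U)\, e^{-t}, \qquad V_{xx} = U_{xx}\, e^{-t}.
\end{align*}
% Hence V_{tt} + 2 V_t = V_{xx} holds exactly when U_{tt} = U_{xx} + U.
```

The first-order term has been removed at the price of the extra term U, and it is this term that distinguishes the reduced equation from the wave equation.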

He now looked for a solution corresponding to the initial conditions $U = f$ and
$\frac{\partial U}{\partial t} = f_1$, where f and $f_1$ vanish outside the interval a < x < b and are polynomials
in between. He obtained the solutions by the method of Fourier transforms, which
lies outside this course, and was able to show that the solutions are a combination of
two basic types, one in which the initial conditions are $f = 0$, $f_1 \neq 0$ and the other
in which $f \neq 0$, $f_1 = 0$.

In the first case, the solution is a certain Bessel function A(x, t) for −t < x < t
and zero otherwise.⁶ This is an interval of length 2t.

In the second case, the solution is a more complicated expression involving the
same Bessel function that is zero outside the interval (a − t, b + t), which is
an interval of length b − a + 2t.

This means that what goes in as a pulse of width b − a comes out spread over an interval
of increasing width. However, Poincaré was able to show that if b − a is very small
then the solutions are

$$U(x,t) = \tfrac{1}{2} f(x - t), \qquad a + t < x < b + t,$$

$$U(x,t) = \tfrac{1}{2} f(x + t), \qquad a - t < x < b - t,$$

and U(x, t) = 0 otherwise. More precisely, the other terms in the solution depend
on b − a and are negligible if b − a is very small. The same is true of the solution
to the original equation: $V(x,t) = U(x,t)e^{-t}$.

However, if b − a has a finite size, then the solutions will take the form of a pulse
with a head and then a tail of length proportional to t and therefore to x, the length
of the wire. As he put it, if a pulse of some simple kind is transmitted between times
t = a and t = b


one sees first of all that the head of the perturbation will travel with a certain speed, in such a
way that in front of this head the perturbation is zero, contrary to what happens in Fourier’s
theory of heat and in agreement with the laws of propagation of light or of plane sound waves
deduced from the equation of the vibrating string. But there is an important difference with
this latter case, because the perturbation, as it propagates, leaves behind a non-zero residue
.... If b − a is small ... the residue is negligible in front of the principal perturbation, but this
is not the case if the perturbation lasts for a long time and if b − a is finite. The residue can
then disturb the observations.


6 Bessel functions arise in the oscillations of a hanging chain, and are standard fare in applied
mathematics.


Therefore, when an attempt is made to transmit a periodic wave down the wire, the 
velocity and wavelength depend on the frequency, the waves undergo dispersion, and 
the head of the disturbance moves with a finite speed. This is also the case with the 
transmission of light but not of heat, and the head, once it has passed, leaves behind 
a disturbance which never vanishes, unlike what happens with the wave equation. 

Poincaré was apparently unaware of a remarkable discovery that Heaviside had 
made in 1887, when he showed that the values of the physical constants can be 
so adjusted that the rate of dispersion is zero. This can be done both mathematically
and physically; it merely requires that the leakage be non-zero. Far from being
an inconvenience, this condition is necessary for the production of distortionless 
telephony. The signal becomes fainter over distances, but this can be corrected by 
fitting amplifiers. Long-distance telegraphy had dealt with distortion by accepting a 
low transmission rate, so as to separate the pulses. Telephony required much higher 
frequencies; with some leakage and a deliberately high self-inductance it became 
distortionless. Long-distance communication was reborn—although the money for 
the first successful patents went to the American electrical engineer Michael Pupin 
in 1901, and not to Heaviside.⁷

It is intriguing to see that Poincaré also failed to mention Heaviside’s ingenious 
discoveries in his lecture course in 1894, Cours sur les oscillations électriques. There 
he surveyed a considerable amount of mostly French experimental work, with a view 
to deciding between the old theory of electro-magnetism (due to Kirchhoff) and the 
modern theories of Maxwell and Hertz. The reason may have been a misplaced inter- 
est in the general case. Poincaré’s analysis of the telegraphist’s equation depended 
on the condition $B^2 - 4AC \neq 0$ or $KR \neq LS$, but equality in these cases is exactly
the condition upon which Heaviside’s insight depends. So the experimental work 
was given a theoretical twist and technological implications were not mentioned. 
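
Why equality matters can be seen directly from Eq. (20.1). The following is a sketch of the standard calculation, not a transcription of Heaviside's or Poincaré's own argument; the decay rate $\lambda$ and the function w are symbols introduced here for the purpose.

```latex
% Start from (20.1): KL u_{tt} + (KR + LS) u_t + RS u = u_{xx},
% and substitute u = e^{-\lambda t} w with \lambda = (KR + LS)/(2KL).
\begin{align*}
u_t    &= e^{-\lambda t}\,(w_t - \lambda w), \\
u_{tt} &= e^{-\lambda t}\,(w_{tt} - 2\lambda w_t + \lambda^2 w), \\
KL\,u_{tt} + (KR + LS)\,u_t + RS\,u
       &= e^{-\lambda t}\left( KL\,w_{tt} - \frac{(KR + LS)^2 - 4\,KL \cdot RS}{4KL}\, w \right) \\
       &= e^{-\lambda t}\left( KL\,w_{tt} - \frac{(KR - LS)^2}{4KL}\, w \right).
\end{align*}
% So w satisfies  KL w_{tt} - \frac{(KR - LS)^2}{4KL} w = w_{xx}.
% When KR = LS the extra term vanishes, w obeys the wave equation, and
% u = e^{-(R/L) t} w is an undistorted wave that merely fades as it travels.
```

On this reading the non-zero quantity is $(KR - LS)^2$, and the signal travels without change of shape, though with fading strength, precisely when it vanishes.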


20.3.1 Conclusion


The principal three partial differential equations that we have considered, the heat
equation, Laplace’s equation, and the wave equation, became known as the differential
equations of mathematical physics. It is a striking fact that between them
they describe so many of the advances in applied mathematics made in the nineteenth
century and into the twentieth, and it is fortunate that in many cases they can
be solved when appropriate boundary or initial conditions are specified, for, as the
telegraphist’s equation indicates, rigorous solution methods for general partial differential
equations are hard to find. But it is remarkable how much of the modern world
was made possible by the study of the calculus of functions of several variables.


7 See Yavetz [276].


20.4 Exercises 


Questions 


1. The telegraphist’s equation can also be seen as a variant of the wave equation 
(technically, in the language of a later chapter, it and the wave equation are both 
hyperbolic partial differential equations). What does it mean that a good—indeed, 
financially successful—understanding of it can be obtained by treating it as the 
heat equation? 


Chapter 21
Revision


21.1 Revision and Assessment 2 


This chapter is given over to revision and discussion of the second assignment, 
see H.3. 

I also recommended that students read some of Sergiu Klainerman’s essay from 
2000: “PDE as a unified subject”. Of course, it is sometimes obscure at this stage. 
Many of the themes that have driven research into partial differential equations in 
the twentieth century have not been broached in this course or, very likely, in any 
undergraduate course. But the first 14 pages, omitting pages 5 and 6, are surprisingly 
intelligible, and in any case they are part of an answer to the traditional request from 
better students to be told something about what research mathematicians do. Perhaps 
more to the point, these 14 pages are a modern reflection on the themes that occupy 
the final part of this book, and students will find them worth thinking about when
writing their final essay.

The essay is available on the web at 

https://web.math.princeton.edu/~seri/homepage/papers/telaviv.pdf


Chapter 22
Riemann’s Shockwave Paper


22.1 Introduction 


Riemann was very interested in mathematical physics. He published four papers on 
various aspects of it in his lifetime, and four more were published after his death. 
Of the papers he published, the one on the formation of shockwaves [236] has at 
least two major claims to fame. It is the first paper to explore the phenomenon, and 
it made a contribution to the theory of hyperbolic partial differential equations that 
is still in use today. 

At the start of this paper, Riemann remarked that just as the study of linear partial 
differential equations had been most fruitful when special physical problems were 
investigated rather than general ones, so too the study of non-linear problems was 
likely to benefit from studying physical problems and taking all factors into account. 


22.2 Riemann’s Paper 


Riemann’s [236], as its title indicates, is about plane waves of finite amplitude. In
the papers by d’Alembert, Euler, and several later authors, only waves
of infinitesimal amplitude had been considered. Poisson had published a long and
difficult paper on waves of finite amplitude in 1807, and more recently the leading
German physicist, Hermann von Helmholtz, had published two more papers on
experimental investigations of the subject. In one of them, he was the first person to
explain the phenomenon of overtones. Then the subject had passed to British applied
mathematicians, as Riemann noted.

What is most interesting about Helmholtz’s paper on overtones is his discovery
that while the superposition of sound waves in the air is linear when the oscillations
are infinitesimal, it is not linear for waves of finite amplitude. Instead, overtones
arise when the squared amplitude of the waves exerts a force comparable to that
causing the oscillations, and so the partial differential equation describing the motion
is necessarily non-linear.

But dealing with such waves was the least of Riemann’s novelties. The paper is 
famous for two things: 


1. dealing with a non-linear second-order partial differential equation with discon- 
tinuous solutions, and showing that the zone of influence is wedge shaped. The 
equation is hyperbolic, and its characteristics represent the path in space-time of 
the signals. 

2. his methods, which proved of lasting significance in investigating hyperbolic 
partial differential equations. 


Riemann considered a compressible gas in which motion takes place along the
x-axis. At time t and position x, the density is $\rho$, the pressure p, and the velocity is
u. The relation between pressure and density is given by a function

$$p = \varphi(\rho),$$

where all that is known about $\varphi(\rho)$ is that its derivative is always positive: $\varphi'(\rho) > 0$.
This says, reasonably enough, that pressure increases with density.

He obtained these differential equations for $\rho$ and u (Sect. 1, p. 147):

$$\rho_t = -(\rho u)_x,$$
$$\rho(u_t + u u_x) = -\varphi'(\rho)\rho_x.$$

In terms of $\lambda = \log \rho$, the first of these equations can be written as

$$\lambda_t + u\lambda_x = -u_x$$

and the second as

$$u_t + u u_x = -\varphi'(\rho)\lambda_x.$$


To simplify these equations, he defined 


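$$f(\rho) = \int \sqrt{\varphi'(\rho)}\; d\log\rho$$

and

$$r = \tfrac{1}{2}\left(f(\rho) + u\right), \quad\text{and}\quad s = \tfrac{1}{2}\left(f(\rho) - u\right).$$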


The new variables r and s will be shown to be the coordinate variables that simplify
the partial differential equation. They are also a pair of characteristics, and it may be
helpful to look at the much simpler topic of Burgers’ equation (see Appendix B.2)
before proceeding.

For brevity, let us also write

$$\alpha = u + \sqrt{\varphi'(\rho)}, \quad \beta = u - \sqrt{\varphi'(\rho)}.$$

Riemann now deduced in a few lines that

$$r_t = -\alpha r_x,$$
$$s_t = -\beta s_x.$$

From the formula $dr = r_x\,dx + r_t\,dt$, he deduced that

$$dr = r_x(dx - \alpha\,dt)$$

and so r is constant along the curve defined by

$$\frac{dx}{dt} = \alpha,$$

and so the point with a constant value of r moves forward with velocity $\alpha$ in the
direction of increasing x. Similarly, points with constant s move backwards with
velocity $-\beta$ in the direction of decreasing x.
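
The "few lines" can be reconstructed as follows. This is a sketch in the notation of this section, not a transcription of Riemann's own working; it uses only the two equations written in terms of $\lambda = \log\rho$ and the definitions of f, r, s, $\alpha$ and $\beta$.

```latex
% Since f(\rho) = \int \sqrt{\varphi'(\rho)}\, d\log\rho, we have
%   f_t = \sqrt{\varphi'}\,\lambda_t  and  f_x = \sqrt{\varphi'}\,\lambda_x,
% so that 2 r_x = \sqrt{\varphi'}\,\lambda_x + u_x and 2 s_x = \sqrt{\varphi'}\,\lambda_x - u_x.
\begin{align*}
2 r_t &= \sqrt{\varphi'}\,\lambda_t + u_t
       = \sqrt{\varphi'}\,(-u\lambda_x - u_x) + (-u u_x - \varphi'\lambda_x) \\
      &= -\bigl(u + \sqrt{\varphi'}\bigr)\bigl(\sqrt{\varphi'}\,\lambda_x + u_x\bigr)
       = -2\alpha\, r_x, \\[4pt]
2 s_t &= \sqrt{\varphi'}\,\lambda_t - u_t
       = \sqrt{\varphi'}\,(-u\lambda_x - u_x) + (u u_x + \varphi'\lambda_x) \\
      &= -\bigl(u - \sqrt{\varphi'}\bigr)\bigl(\sqrt{\varphi'}\,\lambda_x - u_x\bigr)
       = -2\beta\, s_x.
\end{align*}
```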

As he put it 


a particular value of r, or of $f(\rho) + u$, moves towards larger values of x with velocity
$\sqrt{\varphi'(\rho)} + u$, while a particular value of s, or of $f(\rho) - u$, moves towards smaller values of
x with velocity $\sqrt{\varphi'(\rho)} - u$.


A definite value of r will gradually meet with each value of s lying ahead of r, and the 
velocity of its progress will depend at a given moment on the value of s with which it meets. 


A further calculation, the details of which I omit, led Riemann to observe that the
differential

$$(x - \alpha t)\,dr - (x - \beta t)\,ds$$

is exact, and if it is set equal to dw then w satisfies the partial differential equation

$$w_{rs} = m(w_r + w_s), \qquad (22.1)$$

where m is a function of r + s. In fact, on setting $f(\rho) = r + s = \sigma$,

$$m = -\frac{1}{2}\,\frac{d \log \frac{d\rho}{d\sigma}}{d\sigma}.$$

However, if standard hypotheses about gases are admitted (Poisson’s and Boyle’s
law) then, Riemann showed, it is possible to reduce to the situation where $m = -\frac{1}{2a}$,
where a is a constant of proportionality in Boyle’s law. This formulation depends on
r and s not being constants, and Riemann also looked quickly at the situation where
either r or s is constant (if r is constant then w is a function of s alone, and if s is
constant then w is a function of r alone).

The change of coordinates from x and t to r and s depends on r and s not being
constant, and so, as Riemann noted, his method gives no information about any region
in which r or s is constant. Moreover, the coordinate change is only valid where the
Jacobian is finite and non-zero, and this Jacobian is

$$2\sqrt{\varphi'(\rho)}\; r_x s_x.$$


The cases $r_x = 0$ and $s_x = 0$ have been discussed. What is much more interesting
is Riemann’s argument about $\rho$ as a function of x. From its definition it follows that
the graph of $\rho$ as a function of x varies in time, and the higher values of $\rho$ move
forward faster than the lesser values. So if the graph is an increasing function it evolves as time
goes by into a less steep function. But if the graph is that of a decreasing function, it
can evolve into the graph of a multi-valued function of x, which is absurd. For this
to happen, Riemann showed, it is enough that $r_x = \infty$.
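
The overtaking can be illustrated with a small computation. This is only a sketch under stated assumptions: Boyle's law in the form $\varphi(\rho) = a^2\rho$, a forward-moving wave in which s is constant, and gas at rest where the density has a reference value; the function names and sample values are hypothetical. Under these assumptions a given density value travels with velocity $a + a\log(\rho/\rho_{\text{rest}})$, and the profile first becomes multi-valued when some sampled point catches the one ahead of it.

```python
import math


def value_speed(rho, a=1.0, rho_rest=1.0):
    """Speed of a given density value in a forward wave with s constant,
    assuming Boyle's law p = a^2 * rho and gas at rest where rho = rho_rest."""
    return a + a * math.log(rho / rho_rest)


def breaking_time(xs, rhos, a=1.0, rho_rest=1.0):
    """Earliest time at which a sampled point of the profile catches its
    neighbour ahead; math.inf if no point ever overtakes the next one."""
    speeds = [value_speed(rho, a, rho_rest) for rho in rhos]
    times = []
    for i in range(len(xs) - 1):
        closing_speed = speeds[i] - speeds[i + 1]
        if closing_speed > 0:  # the point behind is faster, so the gap closes
            times.append((xs[i + 1] - xs[i]) / closing_speed)
    return min(times, default=math.inf)


# A decreasing density profile steepens and breaks ...
print(breaking_time([0.0, 1.0], [math.e, 1.0]))
# ... while an increasing one only spreads out.
print(breaking_time([0.0, 1.0], [1.0, math.e]))
```

The sampled profile stands in for the continuous graph of the density; refining the sampling brings the computed time closer to the moment at which the graph becomes vertical.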

Riemann had now arrived at the linear second-order partial differential equa- 
tion (22.1). He now began to single out a number of important features of its solutions, 
but first, in Sect. 4, he made some general observations of lasting significance. 


We treat first of all the case where the initial disturbance of equilibrium is restricted to a
finite region defined by the inequalities a < x < b. Thus outside this interval, u and $\rho$, and
consequently r and s, are constant. The values of these quantities for x < a are denoted
with suffix 1; for x > b suffix 2. The region in which r is variable gradually moves forward
according to Section 1, its lower bound having velocity $\sqrt{\varphi'(\rho_1)} + u_1$, while the upper bound
of the region, in which s is variable, moves backward with velocity $\sqrt{\varphi'(\rho_2)} - u_2$. After a
time interval

$$\frac{b - a}{\sqrt{\varphi'(\rho_1)} + \sqrt{\varphi'(\rho_2)} + u_1 - u_2}$$

the two regions separate, and between them a gap forms in which $s = s_2$ and $r = r_1$, and
consequently the gas particles are again in equilibrium. Thus from the initially disturbed
location, two waves issue in opposite directions. In the forward wave, $s = s_2$; accordingly,
to a particular value $\rho$ of the density is associated the velocity $u = f(\rho) - 2s_2$, and both
values [i.e. of density and velocity, JJG] move forward with constant velocity

$$\sqrt{\varphi'(\rho)} + u = \sqrt{\varphi'(\rho)} + f(\rho) - 2s_2.$$

In the wave moving backward, on the other hand, the velocity $-f(\rho) + 2r_1$ is associated to
the density $\rho$, and these two values move backward with velocity $\sqrt{\varphi'(\rho)} + f(\rho) - 2r_1$. The
rate of propagation is greater for greater densities, because both $f(\rho)$ and $\sqrt{\varphi'(\rho)}$ increase
with $\rho$.

If we think of $\rho$ as the ordinate of a curve for the abscissa x, then each point of this curve
moves forward parallel to the x-axis with constant velocity. Indeed the greater the ordinate,
the greater the velocity will be. It is easy to see that, according to this law, points with
greater ordinates would finally overtake preceding points, with smaller ordinates, so that to
a given value of x would correspond more than one value of $\rho$. Since this cannot occur in
physical reality, a condition must enter that renders the law invalid. In fact, the derivation of
the differential equation is based on the assumption that u and $\rho$ are continuous functions of
x having finite derivatives. However, this assumption ceases to hold as soon as the density
curve is perpendicular to the x-axis at some point. From this moment on, a discontinuity
appears in this curve, so that a greater value of $\rho$ immediately succeeds a smaller value. This
case will be discussed in the next section.


The compression waves, that is, the parts of the wave in which the density increases in 
the direction of propagation, become ever narrower with their forward progress and finally 
become compression shocks. However, the width of the expansion waves grows in proportion 
to elapsed time. 


We may easily show, at least under the assumption of Poisson’s (or Boyle’s) law, that in the 
case when the initial disturbance of equilibrium is not confined to a finite region, compression 
shocks must also form in the course of the motion, excluding quite special cases. The velocity 
with which a value of r moves forward is 

$$\frac{k+1}{2}\,r + \frac{k-3}{2}\,s$$

under this hypothesis. Thus larger values will, on average, move with greater velocity. A 
larger value r′ must eventually overtake a preceding smaller value r″, unless the value of s 
corresponding to r″ is, on average, smaller by 

$$\frac{1+k}{3-k}\,(r' - r'')$$

than the value of s simultaneously corresponding to r′. In this case, s becomes negatively 
infinite for positive infinite r, and thus for x = +∞, the velocity u is +∞ (or instead, the 
density, according to Boyle’s law, becomes infinitely small). Thus excluding special cases, it 
must always transpire that a value of r, larger by a finite amount, follows immediately after a 
smaller value. Consequently, since $\partial r/\partial x$ becomes infinite, the differential equations lose their 
validity, and forward-moving compression shocks must occur. 
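
The overtaking argument in the quoted passage can be illustrated numerically. The following is only a sketch under stated assumptions, not Riemann's own computation: it assumes a pressure law of the form φ(ρ) = A²ρ^K with the illustrative values K = 1.4 and A = 1, treats the whole profile as a forward wave in which s keeps the value it has in gas at rest ahead of the wave, and uses a hypothetical initial density hump. Each density value then travels with the constant velocity given in the quotation, and the first overtaking happens at the time −1/min(dc/dx), where c is the velocity as a function of initial position.

```python
import math

K = 1.4  # assumed exponent in the pressure law phi(rho) = A**2 * rho**K
A = 1.0  # assumed constant in the same law


def sound_speed(rho):
    """sqrt(phi'(rho)) for the assumed law phi(rho) = A**2 * rho**K."""
    return A * math.sqrt(K) * rho ** ((K - 1) / 2)


def f(rho):
    """f(rho) = integral of sqrt(phi'(rho)) d(log rho), with f(0) = 0."""
    return 2 * A * math.sqrt(K) / (K - 1) * rho ** ((K - 1) / 2)


def wave_velocity(rho, rho_ahead=1.0):
    """Velocity sqrt(phi'(rho)) + f(rho) - 2*s2 of a density value in the
    forward wave; 2*s2 = f(rho_ahead) because the gas ahead is at rest."""
    return sound_speed(rho) + f(rho) - f(rho_ahead)


def hump(x):
    """Hypothetical initial density profile: a small smooth hump."""
    return 1.0 + 0.2 * math.exp(-x * x)


def breaking_time(density, x_min=-5.0, x_max=5.0, n=20001):
    """First time at which a faster value overtakes a slower one ahead of it.

    A point starting at x0 sits at x0 + c(x0) * t, so neighbouring points
    collide when 1 + c'(x0) * t = 0; the earliest such time is -1 / min c'.
    """
    h = (x_max - x_min) / (n - 1)
    steepest = 0.0
    for i in range(1, n - 1):
        x = x_min + i * h
        slope = (wave_velocity(density(x + h))
                 - wave_velocity(density(x - h))) / (2 * h)
        steepest = min(steepest, slope)
    return math.inf if steepest >= 0 else -1.0 / steepest


print(breaking_time(hump))
```

A profile whose velocity never decreases in the direction of motion gives an infinite breaking time in this sketch, which matches the remark that expansion waves only widen.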


In the next sections, Riemann showed how the compression shocks propagate and 
that the values of u and ρ on either side of the shock are linked. Riemann, 
however, failed to ensure conservation of energy; the relevant relations were provided 
in Rankine [232], Rayleigh [233], and Hugoniot [147, 148]. But Riemann did notice 
that the shocks must be supersonic with respect to the state in front of them and 
subsonic behind.¹ 

The analysis naturally depends on the initial conditions, and he showed that when 
u and ρ each have two different constant values on x < 0 and x > 0, two waves 
emerge from the point of discontinuity and each could be either a compression or a 
rarefaction wave. He analysed all four cases. 

In the final sections of the paper, Riemann did not show how to solve the partial 
differential equation (22.1) but how to transplant the method of Green’s functions 
from the elliptic to the hyperbolic case. His method of defining and using the adjoint 
of the given partial differential equation in order to solve the equation has since 
become standard. 

His method was extremely ingenious. He wished to solve a partial differential 
equation by a function that, with its first derivatives, takes given values on a given 
(non-characteristic) curve. To do this, he introduced a new partial differential equation 
(technically, this is the adjoint of the original partial differential equation) where he 
was free to specify its boundary conditions, provided they meet certain constraints. 
He then showed that a solution to the equation he wished to solve can be found if a 
solution of the new equation can be found. But that equation can be solved, although 
it is very similar to the first one, because the boundary conditions can be suitably 
chosen. Therefore, the original equation can be solved. 

¹Rankine cited four previous authors: Poisson, Stokes, Airy, and Earnshaw, but not Riemann; 
Riemann cited only Helmholtz; Rayleigh and Hugoniot did not mention Riemann. The first person 
to follow Riemann was E.B. Christoffel in his [41]. 


22.3 Darboux on Riemann’s Approach to the Shockwave Equation 


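To complete the story, we now follow Darboux’s use of Riemann’s method to solve 
Euler’s equation (4.6); I shall consider only the case β ≠ β′. 
The adjoint equation to that equation is 

$$\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\beta'}{x-y}\frac{\partial u}{\partial x} - \frac{\beta}{x-y}\frac{\partial u}{\partial y} - \frac{\beta+\beta'}{(x-y)^2}\,u = 0.$$

On setting $u = (x-y)^{\beta+\beta'}v$ the last term disappears, so the adjoint equation can be 
treated in the form 

$$\frac{\partial^2 v}{\partial x\,\partial y} - \frac{\beta}{x-y}\frac{\partial v}{\partial x} + \frac{\beta'}{x-y}\frac{\partial v}{\partial y} = 0. \tag{22.2}$$

Let us denote a solution of this equation by Z(β′, β), so 

$$u = (x-y)^{\beta+\beta'}Z(\beta', \beta).$$

We already know that $x^{\lambda}F(-\lambda, \beta', 1-\lambda-\beta, y/x)$ is a solution of Eq. (4.6), 
so switching β and β′ gives $v = x^{\lambda}F(-\lambda, \beta, 1-\lambda-\beta', y/x)$ as a solution of 
Eq. (22.2). 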

This can be souped up by applying Mobius transformations to x and y into a more 
general solution 


v= (yo = x =x) P OFCA, B,1—A=—f',0), 


where $\sigma = \dfrac{(x - x_0)(y - y_0)}{(x - y_0)(y - x_0)}$. Therefore, 


u = (y — xP (y9 — x) — x9) POF (-A, 8, 1—-A- B',0) 


is a solution to the adjoint equation. 
Still following Darboux, we now seek a solution u of the adjoint equation that is 
equal to $\left(\frac{y - x_0}{y_0 - x_0}\right)^{\beta'}$ when x = x₀, and is $\left(\frac{y_0 - x}{y_0 - x_0}\right)^{\beta}$ when y = y₀. When x = x₀ we have σ = 0, the 
series for F reduces to 1, and there is a factor $(x_0 - x)^{\lambda+\beta'}$ that makes u either zero or 
infinite unless λ = −β′. 

Setting λ = −β′ we find that 

$$u = (y_0 - x)^{-\beta'}(y - x)^{\beta+\beta'}(y - x_0)^{-\beta}F(\beta, \beta', 1, \sigma);$$

and indeed if x = x₀ then $u = \left(\frac{y - x_0}{y_0 - x_0}\right)^{\beta'}$, and if y = y₀ then $u = \left(\frac{y_0 - x}{y_0 - x_0}\right)^{\beta}$. So 
u is the solution of the adjoint equation that we seek. 


Finally, to solve Eq. (4.6) in the most general form one substitutes this value of u 
in either Eq. (31.32) or (31.33). For example, from (31.33) one deduces 

$$z_{x_0,y_0} = (uz)_{x_1,y_1} + \int_{x_0}^{x_1} u_{x,y_1} f_1(x)\,dx + \int_{y_0}^{y_1} u_{x_1,y} f_2(y)\,dy.$$

Recall that in this notation, f₁ and f₂ are two arbitrary functions that depend on the 
boundary conditions for z, and $\Phi_{\alpha,\beta}$ is what is obtained by replacing x and y by α 
and β in the function Φ(x, y). 


22.4 Telegraphy 


At this point in the second edition of his Leçons Darboux showed how to connect 
these ideas to the study of telegraphy, which we considered in Chap. 20. 

He first showed how to deduce a solution to Euler’s equation (4.6) in the case 
when β = β′ from a solution when β ≠ β′. The equation becomes 

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\beta(1-\beta)}{(x-y)^2}\,z$$


and the solution he obtained in this fashion is 


u = (yo — x) PCy — x)(y — x0) OF (>. BA sae) 


"(x — yo)(y — x0) 


He replaced @ by @ — x in the equation, and let G@ — oo when the equation 
becomes 
az 
—j ba 
OxOy 
which he remarked “is a transform of the so-called telegraphist’s equation”. 

This allowed him to solve the telegraphist’s equation by what he called indirect 
methods, but he said it was better to apply Riemann’s methods. 

He began with the telegraphist’s equation in the form 


OU OU dU 
a ’ 
Ox2 Or? dt 


where α and β are positive constants and U = U(x, t) denotes the potential at time 
t and a distance x. He wrote 
U =e ley, 


and chose units in which the speed of light is unity, and so the equation became 

$$\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial t^2} + u = 0. \tag{22.3}$$

The characteristics of this equation are 

x + t = const., x − t = const. 


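Darboux then continued Sect. 361 as follows: 

This equation is its own adjoint. If u and v are two distinct solutions of it, then 

$$u\frac{\partial^2 v}{\partial x^2} - v\frac{\partial^2 u}{\partial x^2} - u\frac{\partial^2 v}{\partial t^2} + v\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial t}\left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right) = 0.$$

So the integral 

$$\int \left(u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right)dt + \left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right)dx$$

vanishes on any contour. 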


Let us then, as in Sect. 359, form a contour partly composed of characteristics. Let Ox be 
the x-axis and Ot be the t-axis. The characteristics are represented by lines parallel to the 
bisectrices of the axes. If A is an arbitrary point of the plane, we draw through this point 
two characteristics that cut an arbitrary curve in two points β and γ, and we take the above 
integral around the contour AβγA. 


On Aβ one has dx = dt so one can exchange dx and dt; as a result the corresponding portion 
of the integral will be 

$$\int_A^{\beta} (u\,dv - v\,du).$$

Similarly, the portion of the integral taken over Aγ, on which dx = −dt, will be 

$$\int_A^{\gamma} (u\,dv - v\,du).$$

If one can find a solution v of the equation that reduces to 1 on the segments Aβ and Aγ, 
then the sum of the two previous integrals will be 

$$2u_A - u_\beta - u_\gamma,$$

and one will have the equation 

$$2u_A = u_\beta + u_\gamma - \int_\beta^\gamma \left[\left(u\frac{\partial v}{\partial x} - v\frac{\partial u}{\partial x}\right)dt + \left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right)dx\right]$$

in which everything is known as soon as one knows the function u and one of its derivatives 
on the curve K. 


It remains to find the solution v that we have supposed to exist. The preceding arguments 
show us the way, and we are led to look for a solution of Eq. (22.3) that depends only on the 
variable 

$$\theta = \frac{(t - t_0)^2 - (x - x_0)^2}{4}.$$

One finds, for this solution, precisely the function J that satisfies the differential equation 

$$\theta J'' + J' - J = 0,$$

which is associated to Bessel functions. 


Let us apply our general integral to the particular case that is the most important in practice, 
where our curve K reduces to the x-axis, and let us denote the coordinates of A by x₀, t₀. We 
then have 

$$2u_A = u_\beta + u_\gamma - \int_\beta^\gamma \left(u\frac{\partial v}{\partial t} - v\frac{\partial u}{\partial t}\right)dx.$$

Suppose that we are given at the start the potential and its derivative. We then know that at 
the start 

$$u = f(x), \qquad \frac{\partial u}{\partial t} = \varphi(x).$$

If we recall that x − t and x + t remain constant on Aβ and Aγ, respectively, we will have 

$$2u_A = f(x_0 - t_0) + f(x_0 + t_0) + \int_{x_0 - t_0}^{x_0 + t_0} v\,\varphi(x)\,dx + \frac{t_0}{2}\int_{x_0 - t_0}^{x_0 + t_0} f(x)\,\frac{dv}{d\theta}\,dx,$$

where v is equal to J(θ) and θ is $\dfrac{t_0^2 - (x - x_0)^2}{4}$. 


All the details of the propagation can be deduced from this formula. 
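
The two claims in Darboux's construction that are easiest to check by machine are that the function J exists with J(0) = 1, and that v = J(θ) then solves the equation while equalling 1 on both characteristics through A. The sketch below is an illustration, not Darboux's own argument. It assumes that Eq. (22.3) reads u_xx − u_tt + u = 0 and that J satisfies θJ″ + J′ − J = 0, and it takes J to be the power series Σ θⁿ/(n!)², which is the modified Bessel function I₀ evaluated at 2√θ. The function names are hypothetical.

```python
def J(theta, terms=60):
    """J(theta) = sum over n of theta**n / (n!)**2, so that J(0) = 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= theta / ((n + 1) ** 2)
    return total


def dJ(theta, terms=60):
    """Term-by-term derivative: sum over m of theta**m / (m! (m+1)!)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= theta / ((m + 1) * (m + 2))
    return total


def d2J(theta, terms=60):
    """Second derivative: sum over m of theta**m / (m! (m+2)!)."""
    total, term = 0.0, 0.5
    for m in range(terms):
        total += term
        term *= theta / ((m + 1) * (m + 3))
    return total


def ode_residual(theta):
    """theta * J'' + J' - J, which should vanish up to rounding."""
    return theta * d2J(theta) + dJ(theta) - J(theta)


def v(x, t, x0, t0):
    """The candidate solution v = J(theta) attached to the point A = (x0, t0)."""
    return J(((t - t0) ** 2 - (x - x0) ** 2) / 4)


def pde_residual(x, t, x0, t0, h=1e-3):
    """Central-difference estimate of v_xx - v_tt + v at the point (x, t)."""
    centre = v(x, t, x0, t0)
    v_xx = (v(x + h, t, x0, t0) - 2 * centre + v(x - h, t, x0, t0)) / h**2
    v_tt = (v(x, t + h, x0, t0) - 2 * centre + v(x, t - h, x0, t0)) / h**2
    return v_xx - v_tt + centre


print(ode_residual(0.7), pde_residual(0.3, 0.7, 0.0, 2.0))
print(v(1.0, 1.0, 0.0, 2.0))  # a point on the characteristic x + t = x0 + t0
```

On either characteristic through A the variable θ vanishes, so v reduces to J(0) = 1 there, which is the boundary behaviour the argument requires.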


22.5 Exercises 


As with some of my other history courses, there comes a point where the historical 
significance of some of the conceptual developments under discussion is arguably 
obscured by strictly mathematical exercises, which would either be too elementary 
or too hard. This is the case from now on, so only questions will now appear at the 
end of each chapter. 
Questions 


1. Look back over the solution to the wave equation in the form $u_{xt} = 0$. The 
characteristic curves are parallel to the x- and t-axes, and the general solution is 
u(x,t) = f(x) + g(t). How much information must be given on the x- and t- 
axes for the solution to be determined in the rectangle defined by the origin and 
the points (a, 0), (0, b), and (a, b)? 

2. Look over the account of Burgers’ equation in Sect. B.2 and then again at Rie- 
mann’s account of the formation of a shockwave. 


Chapter 23 
The Example of Minimal Surfaces 


23.1 Introduction 


Few branches of mathematics have the visual charm of the theory of minimal sur- 
faces, which is one of the areas where analysis and differential geometry most prof- 
itably intersect. The topic was initiated by Euler and Lagrange but advanced only 
slowly until the work of Meusnier in the 1780s. The appropriate partial differential 
equation is difficult because it is non-linear, and it was taken up by Legendre and 
Monge, but left many secrets that were only to be unlocked in the later nineteenth 
century, chiefly by Riemann and Weierstrass.¹ 


23.2 Euler and Lagrange 


Many interesting problems in geometry and analysis arise when a function of some 
kind is to be minimised. A geodesic on a surface is a curve of shortest length joining 
two points, and we have already seen in Chap. 7 that problems in the calculus of 
variations can have attractive solutions. Informally, a minimal surface is a surface 
of least area spanning a given curve in space, and many elegant surfaces can be 
obtained by dipping a wire frame into a strong soap solution.² More precisely, a 
minimal surface is a surface with the property that any closed curve drawn on the 
surface encloses a region of smaller area than any other surface with that curve as 
boundary.³ So minimal surfaces are the two-dimensional analogues of geodesics 
on a surface. Unfortunately, for reasons that have to do with the difficulty of the 
mathematics and the poor grasp people had of it in the early days, the name is a 
bit of a misnomer. It emerges from the mathematical formulations of the theory that 
they should be called extremal surfaces: they may be of least area, greatest 
area, or ambiguous in this respect. A precise analogy is with finding a maximum 
or a minimum of a curve with an equation y = f(x). If you only have the ability 
to consider solutions of y′ = 0 you will find the maxima, minima, and horizontal 
inflection points, that is, the extremal points. 

¹For more detail on all of this material, see the forthcoming book by Gray and Micallef on Jesse 
Douglas, minimal surfaces, and the first Fields Medal. 

²As a quick check on Google images will confirm. An hour or two with homemade wire contours in 
various shapes, such as a trefoil knot, or two circles (unlinked and linked) will be highly instructive. 

³The curve must be non-self-intersecting and bound a region of the surface. 

The study of minimal surfaces began in the eighteenth century with Euler, who 
had the idea that they might be interesting, Lagrange, who gave an analytic version of 
the theory in the form of a partial differential equation satisfied by a minimal surface 
given by an equation of the form z = f(x, y), and Meusnier, who gave a geometric 
condition that a minimal surface must satisfy. 

In Sect. V, Sects. 45-47 of his Methodus inveniendi (1744) Euler discussed prob- 
lems involving surfaces of revolution. In Sect. 45, he showed how to find, among all 
curves passing through two points that enclose a given area with an axis, the curve 
that on rotation about that axis generates the solid whose surface has the least area. 
In Sect. 46, he showed how to find, among all curves of a given length and passing 
through two points, the one that generated the greatest solid on being rotated about 
an axis.⁴ In Sect. 47, he found among the same class of curves the one that generated 
the solid of either greatest or least area on being rotated about the axis. 

In each of these cases, and throughout the book, Euler began with the integral 
expressing the quantity to be maximised or minimised, looked at the variation of 
the integral, and deduced a differential equation that characterised the solution. In 
Sect. 45, the solution curve satisfies the equation 

$$dx = \frac{(ny + b)\,dy}{\sqrt{(1 - n^2)y^2 - 2bny - b^2}}.$$


Here b is a constant of integration and n is a parameter that expresses the effect of 
the constraints; it is what has come to be called a Lagrange multiplier. 

Although Euler could certainly have integrated the above expression, he might 
well have found the general solution unilluminating, and instead he considered only 
the special cases where b = 0 (b is a constant of integration), n = 0, and n = −1 (a 
case we shall ignore). 
When n = 0 

$$dx = \frac{b\,dy}{\sqrt{y^2 - b^2}}$$

and Euler wrote that “the curve will be a catenary concave to the axis”.⁵ In this case, 
the constraint does not enter the problem, and the Euler-Lagrange equation, as we 
would say, for the problem is for the case of curves of any length. But Euler wrote 
not a word about that, and dealt only with the other special case, n = −1. 

⁴A facsimile of the relevant pages of Problems 45 and 46 will be found in Nitsche ([207], 6). 
⁵A catenary, with respect to suitable axes, is given by the equation y = cosh x. 
In Sect. 47, the solution curve satisfies the equation 

$$dx = \frac{c\,dy}{\sqrt{(b + y)^2 - c^2}},$$

where b is an arbitrary constant determined by the length of the curve and c is a 
constant of integration, so 

$$y = -b + c\cosh(x/c).$$
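
As a quick plausibility check on the integration step, the sketch below compares the slope of the curve y = −b + c cosh(x/c) with the slope read off from the differential equation, assuming that equation has the form dx = c dy / √((b + y)² − c²). The sample values of b, c and x are arbitrary, and the function names are hypothetical.

```python
import math


def catenary(x, b, c):
    """The claimed solution y = -b + c * cosh(x / c)."""
    return -b + c * math.cosh(x / c)


def slope_from_equation(y, b, c):
    """dy/dx from dx = c dy / sqrt((b + y)**2 - c**2), on the branch x > 0."""
    return math.sqrt((b + y) ** 2 - c ** 2) / c


def slope_numeric(x, b, c, h=1e-6):
    """Central-difference slope of the claimed solution."""
    return (catenary(x + h, b, c) - catenary(x - h, b, c)) / (2 * h)


b, c, x = 0.5, 1.2, 0.8
print(slope_numeric(x, b, c), slope_from_equation(catenary(x, b, c), b, c))
```

The two printed slopes should agree to many decimal places, both being sinh(x/c).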


The corresponding surface is a minimal surface only when b = 0, a condition that 
Euler did not mention (nor did he remark that when b = 0 this problem coincides 
with Problem 45 in the case when n = 0). He did, however, conclude that the answer 
is a catenary, the minimum area deriving from the case when the catenary is concave 
with respect to the axis, the maximum when it is convex. This surface, in the particular 
case when it is also a minimal surface, was later called the catenoid by Plateau. 

Most books about minimal surfaces credit Euler with being the first to find a 
surface of revolution that is a minimal surface, and refer to these examples. But it 
would seem that he did not, in fact, explicitly address the problem of finding the 
curve that generates the surface of least area on being rotated about an axis, and 
although its solution appears, it does so only as a special case about which he said 
nothing. 

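Lagrange improved on Euler’s treatment of the calculus of variations in his essay 
(1761) and, in particular, in Appendix I of the work he tackled the question of minimal 
surfaces in this spirit. 

He wrote down that the area of a piece of surface given by an equation of the 
form z = f(x, y) and spanning a fixed boundary was given by $\iint W\,dx\,dy$, where 
$W = \sqrt{1 + p^2 + q^2}$ and, as had become customary, $p = \frac{\partial z}{\partial x}$, $q = \frac{\partial z}{\partial y}$. He then argued 
that 

$$\delta\iint W\,dx\,dy = 0 \iff \iint \delta W\,dx\,dy = 0 \iff \iint \frac{p\,\delta p + q\,\delta q}{W}\,dx\,dy = 0.$$

This double integral equals 

$$\iint \left(\frac{p}{W}\frac{\partial\,\delta z}{\partial x} + \frac{q}{W}\frac{\partial\,\delta z}{\partial y}\right)dx\,dy, \tag{23.1}$$

because, according to the general theory he had developed earlier, 

$$\delta p = \delta\frac{\partial z}{\partial x} = \frac{\partial\,\delta z}{\partial x},$$

with a similar expression involving q. Integrating by parts shows that the first variation 
of area vanishes when 

$$\iint \left(\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right)\right)\delta z\,dx\,dy = 0.$$

Therefore 

$$\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right) = 0 \tag{23.2}$$

and Lagrange concluded by remarking that, because of (23.2), 

$$\frac{p\,dy - q\,dx}{W} \tag{23.3}$$

must be an exact differential. 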

As he remarked, when p = 0 = q the exact differential condition is satisfied and 
the surface is a plane, but, as he said, this is a very special case (p. 356) “because the 
general solution must be such that the boundary of the surface can be determined at 
will”. He was, however, unable to solve any other cases of Eq. (23.2) and therefore 
he found no other minimal surface, and concluded his account by showing that the 
sphere solves the problem of finding a surface of least area enclosing a given volume. 

So Lagrange had shown that a surface that is the graph of a function z = f(x, y) 
(the surface is then said to be given in nonparametric or explicit form) and which has 
the least area among all surfaces with a given perimeter is to be found as a solution 
of the Euler-Lagrange equation for the area functional: 

$$\frac{\partial}{\partial x}\left(\frac{p}{W}\right) + \frac{\partial}{\partial y}\left(\frac{q}{W}\right) = 0. \tag{23.4}$$

This equation is today called the divergence form of the minimal surface equation 
(MSE). Lagrange did not write down the corresponding second-order partial 
differential equation explicitly, most likely because, as we have seen, there was no theory 
of partial differential equations at the time.⁶ It is 

$$(1 + q^2)z_{xx} - 2pq\,z_{xy} + (1 + p^2)z_{yy} = 0. \tag{23.5}$$
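
Equation (23.5) can be tested numerically on the two surfaces discussed later in this chapter. The sketch below is an illustration only: it assumes (23.5) has the form (1 + q²)z_xx − 2pq z_xy + (1 + p²)z_yy = 0, writes a piece of the helicoid as the graph z = arctan(y/x) and a piece of the catenoid as the graph z = arccosh(√(x² + y²)), and estimates the derivatives by central differences. The paraboloid is included as a control that is not a minimal surface. All function names are hypothetical.

```python
import math


def mse_residual(z, x, y, h=1e-3):
    """Left-hand side of (1 + q^2) z_xx - 2 p q z_xy + (1 + p^2) z_yy,
    with all derivatives estimated by central differences of step h."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    z_xx = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    z_yy = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    z_xy = (z(x + h, y + h) - z(x + h, y - h)
            - z(x - h, y + h) + z(x - h, y - h)) / (4 * h**2)
    return (1 + q * q) * z_xx - 2 * p * q * z_xy + (1 + p * p) * z_yy


def helicoid(x, y):
    """A piece of the helicoid as a graph, valid for x > 0."""
    return math.atan(y / x)


def catenoid(x, y):
    """Upper half of the catenoid r = cosh z as a graph, valid for r > 1."""
    return math.acosh(math.hypot(x, y))


def paraboloid(x, y):
    """Control surface that does not satisfy the equation."""
    return x * x + y * y


print(mse_residual(helicoid, 1.0, 0.5))
print(mse_residual(catenoid, 1.5, 1.0))
print(mse_residual(paraboloid, 0.5, 0.5))
```

The first two residuals should be zero up to discretisation error, while the paraboloid gives a value far from zero.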


23.3 Meusnier, Monge, and Legendre 


The first mathematician to provide insight into the geometry of minimal surfaces 
was the 21-year-old Jean Baptiste Marie Charles Meusnier, who was briefly a student 
of Monge. To understand his contribution, consider, as Euler had done before him, 
the plane sections of a surface that pass through the normal at a given point, P, say. 
Suppose that we consistently choose a direction for these normals as P varies (let 
us call them $n_P$), as we may at least for small regions of the surface. At each point 
P, the plane sections containing $n_P$ cut the surface in curves that pass through P, 
and each such curve (called a curve of section) has a circle in the plane of section 
that best approximates it. We say that this circle has a positive radius of curvature if 
its centre, C, lies on the normal and in the positive direction heading away from P, 
otherwise negative. The signed magnitude PC, the radius of the circle, is called the 
radius of curvature of the corresponding curve of section. A saddle-shaped surface 
will have some curves of section with positive radii of curvature and others with 
negative radii of curvature. Euler showed that for almost all surfaces 
(except the sphere) and for almost all points on these surfaces, there are precisely 
two curves of section that have extremal values for the radii of curvature among all 
curves of section with a common normal. These radii are called the radii of curvature 
of the surface at the point P (Fig. 23.1). 

Fig. 23.1 The helicoid (left) and the catenoid (right) 

⁶Lagrange did write it in his (1806, 489), albeit in an ad hoc notation. 

Meusnier’s contribution was to realise that Lagrange’s partial differential equation 
for a minimal surface was the condition for a surface to have the average of its radii of 
curvature vanish at each point. This average is called the mean curvature.⁷ The first 
two minimal surfaces were in fact discovered by Meusnier, and they are the helicoid 
and the catenoid. 

In his major paper (1784), Monge introduced the principal curves through an 
arbitrary point on a surface given by an equation of the form z = f(x, y), which he 
defined (in Sect. 22) as the curves along which consecutive normals intersect, and he 
observed that, in general, there are two such curves at each point and they are the 
curves of greatest and least curvature at that point. He went on to give an equation 
for the radii of curvature R: they satisfy the equation $gR^2 + hkR + k^4 = 0$, where 
(as he explained more clearly in his (1787, Sect. 3)) 

$$g = rt - s^2, \quad h = (1 + q^2)r - 2pqs + (1 + p^2)t, \quad \text{and} \quad k^2 = 1 + p^2 + q^2.$$

Here 

$$r = \frac{\partial^2 z}{\partial x^2}, \quad s = \frac{\partial^2 z}{\partial x\,\partial y}, \quad t = \frac{\partial^2 z}{\partial y^2}.$$

For a surface of zero mean curvature, the radii of curvature R₁ and R₂ at a point 
satisfy $\frac{1}{R_1} + \frac{1}{R_2} = 0$, and therefore R₁ + R₂ = 0 and so h = 0. This is, of course, the 
equation for a minimal surface in the form 

$$(1 + q^2)r - 2pqs + (1 + p^2)t = 0,$$

or, more explicitly, 

$$(1 + q^2)\frac{\partial^2 z}{\partial x^2} - 2pq\frac{\partial^2 z}{\partial x\,\partial y} + (1 + p^2)\frac{\partial^2 z}{\partial y^2} = 0.$$

⁷The term was introduced by Sophie Germain in her (1831). 
(la) ar ee 
Monge sought to solve this equation in his long paper (1787).⁸ In Sect. 23 of that 
paper, Monge, after noting that Meusnier had shown that a minimal surface 
was also a surface of zero mean curvature, proceeded to try to solve the partial 
differential equation. He wrote down the equation for the characteristic curves: 

$$(1 + q^2)dy^2 + 2pq\,dx\,dy + (1 + p^2)dx^2 = 0. \tag{23.6}$$

Factorising equation (23.6) then led to the equation for a curve given by 

$$dy + \tau\,dx = 0,$$

along which σ is constant, and another curve, given by 

$$dy + \sigma\,dx = 0,$$

along which τ is constant, where 


ig ey lp pee —pq-iVl +P? +¢ 
o= , and t = . 


1+4q? 1+¢@ 


(23.7) 


Each of these led to expressions for x and y in terms of p and q from which 
Monge obtained expressions for x, y, and z as integrals of functions involving o and 
t. As can be seen, the characteristic curves are complex, not real, and this raises a 
number of problems. Monge’s hope must have been that in the end a real surface 
could be obtained. 

In Sect. XV of the Applications Monge repeated the analysis of the principal 
curves on a surface that he had given in his (1784), and in Sect. XX he turned to 
the study of surfaces of zero mean curvature. He deduced the equation for a surface 
of zero mean curvature from his equation for the radii of curvature and noted that 
this was the equation that Lagrange had shown defined surfaces of least area (but 
this time made no mention of Meusnier’s name; Meusnier had died in the battle of 
Mainz in 1793). 

⁸The paper was submitted in 1784, but only published in 1787. 

Once again Monge proceeded to deduce properties of the minimal surface along 
the characteristic curves. He also presented, as he had before, a second characteristic 
equation: 

$$(1 + q^2)dp^2 - 2pq\,dp\,dq + (1 + p^2)dq^2 = 0, \tag{23.8}$$


which had been mysterious in his paper but had by now been more properly rederived 
by Legendre. After some algebra, here suppressed, he was able to come up with the 
result that a solution to the minimal surface equation was of the form 


$$x = -\Phi'(\alpha) + \Psi'(\beta) \quad \text{and} \quad y = -\Phi(\alpha) + \alpha\Phi'(\alpha) + \Psi(\beta) - \beta\Psi'(\beta),$$


where z is found by differentiating these equations to obtain dx and dy and using 
dz = pdx + qdy. The resulting equation can be integrated and the result is that 


co i} ay ee ee ij W"(B) a1 = Bed. 


For Monge and his contemporaries, the problem with this formula was the appar- 
ently ineradicable appearance of imaginary quantities. As we have seen, mathemati- 
cians in the eighteenth century had no problem using formal complex methods in 
real geometrical problems, but the imaginary quantities were required to cancel at 
the last stage so that the solution could be purely real. Faced with a result where 
apparently this could not be done Monge did not proceed any further with analysis 
and his solution did not enable him to find any more examples of minimal surfaces. 

The apparently intractable nature of the solution remained, as Poisson was to note 
in his (1832), 


Monge integrated [the minimal surface equation] in a finite form, but by considerations 
that did not seem to be admissible and involved him in long discussions with Laplace. 
Legendre then obtained the same integral, by means of a transformation applicable to a 
class of second-order equations, which could not leave any doubt as to the exactness of the 
result. Unfortunately one cannot deduce anything from this integral, which is complicated 
by imaginary quantities .... 


23.4 Riemann and Weierstrass 


The topic of minimal surfaces in the nineteenth century is one of the success stories 
for complex function theory, which is itself a major new development in the period. 
It also illustrates the power of the theory of linear ordinary differential equations, 
which makes it worthwhile considering it here, and it was a rich field for differential 
geometers in the decades after Gauss. 

Gauss had shown the value of studying a surface embedded in ℝ³ by looking 
at the unit normals at each point (and coherently specifying an outward or positive 
direction). These define a map from the surface to the unit sphere by imagining each 
normal is moved parallel to itself until it is based at the centre of the unit sphere, 
and then associating to each point P on the surface the tip of the unit normal, which is 
a point on the sphere. After Gauss’s death, this map came to be called the Gauss map. 

The first to study the Gauss map of a minimal surface was Ferdinand Minding in 
1850. He showed that it enabled one to transplant the curves of longitude and latitude 
on the sphere to curves which Gauss had earlier shown were highly convenient in 
the study of surfaces. Gauss had suggested that a coordinate grid could be imposed 
on a surface by choosing a curve (usually a geodesic), drawing all the geodesics 
that meet this geodesic at right angles, and then drawing all the curves orthogonal 
to those geodesics. If this is done on the sphere starting with the equator, one first 
obtains the meridians or longitudes, and then the parallels or latitudes. Accordingly, 
on any surface the curves in the first family, the geodesics, are called meridians, and 
the curves in the second the parallels. Minding showed that for a minimal surface 
the inverse of the Gauss map maps meridians to meridians and parallels to parallels. 
This meant, in particular, that orthogonal curves were mapped to orthogonal curves. 

Minding was soon followed by Ossian Bonnet, who considered a different grid on 
a surface, defined by taking at each point the two curves through that point whose radii 
of curvature are the extremal values. These curves are called the lines of curvature 
on the surface. Bonnet showed that the lines of curvature on a minimal surface are 
mapped by the Gauss map onto curves on the sphere that map by stereographic 
projection to two families of orthogonal straight lines. It followed, by Gauss’s paper 
of 1825 on conformal mappings, that the Gauss map of a minimal surface was 
conformal. Bonnet was also able to show in his [20] that the coordinates (or at least 
the z-coordinate) of a map defining a minimal surface are harmonic. 

From conformal maps and harmonic maps to complex analytic maps is in hind- 
sight but a small step, but the examples of the authors just discussed show that it 
may not have seemed that way in 1860. Indeed, it was taken for the first time only 
by the two leading complex analysts of their day, Riemann and Weierstrass, inde- 
pendently. Riemann’s account was entrusted by him to Hattendorff for editing in 
April 1866, but apparently dates from 1860 to 1861. The original manuscript con- 
sists purely of formulae, and Hattendorff supplied a text; the result was published 
in 1867. Weierstrass’s account was first given in a lecture at the Berlin Academy in 
1866. Weierstrass was also the first to give a general account of algebraic minimal 
surfaces. 

Riemann’s approach was to define a piece of surface by a map from a patch of 
ℝ² in (p, q)-coordinates into ℝ³ in the usual (x, y, z)-coordinates, then to map the 
surface onto the unit sphere by the Gauss map. The area of an infinitesimal piece of 
the surface in ℝ³ is related to the corresponding area on the sphere by the Jacobian of 
the Gauss map. In this way, the area of the entire surface is known as a double integral, 
and the condition that the surface be a minimal surface is that the first variation of 
that integral vanishes. 

This condition reduced to the statement that a certain differential form was exact, 
and from that exact differential form Riemann deduced that it was possible to impose 
isothermal coordinates on the surface in such a way that its Gauss map became 
complex analytic. It followed that the minimal surface was represented conformally 
on the sphere and the plane, and Riemann went on to show that mean curvature at 
each point was zero. 

From this insight, Riemann was able to obtain formulae that parametrised the 


minimal surface: 
. du 1 
x=Re{i n—-—)dlogn (23.9) 
dlogn n 


2 
du 1 
y = Re aan n+ ; dlogyn (23.10) 


: du : 
z= Re -2i dlogn). (23.11) 
dlogn 


These equations are equivalent to those known today as the Weierstrass–Enneper 
equations for the coordinates on a minimal surface, in the form Weierstrass was 
to give that involves only one function, u = u(η) (see Eqs. (23.12), (23.13), and 
(23.14)). However, Riemann did not pause even to notice that he could now write 
down infinitely many examples of minimal surfaces in terms of a function u = u(η). 

Thus, a minimal surface is obtained every time one has an analytic function. The 
deeper question that then arose was to ask for the minimal surface that spans a given 
curve in space; this is the so-called Plateau problem.⁹ 

Riemann tackled this problem by means of a detailed study of the behaviour of 
the Gauss map, but even he found this task daunting, and gave explicit 
solutions only for simple boundaries: two skew lines in space (the helicoid); two 
intersecting lines and a third lying in a plane parallel to the first two; three skew 
lines (which led to a generalisation of the Riemann P-function); the regular space 
quadrilateral (later studied by Schwarz); and two circles in two parallel planes. 

Weierstrass’s approach was different.'° He started with the expression for the mean 
curvature, and assumed that the given piece of surface can be defined by a conformal 
map from a patch U of R? with coordinates p and q into R? with coordinates x, y, z. 




⁹ It is named for the Belgian physicist Joseph Plateau who, although blind, showed in a series
of experiments that the surface of a liquid is in equilibrium if its mean curvature is constant.
Weightless films, floating in a different liquid, will therefore have zero mean curvature and locally
satisfy Lagrange’s equation.

¹⁰ See Weierstrass [267, 268], and the paper [270], which was published for the first time
posthumously in his Mathematische Werke, vol. 3.

Typically, he gave no references, saying only that it was well known that one could
do this; Gauss had indicated in his (1828) that this can always be done. He then
showed that x, y, and z must be harmonic functions of p and q, and so the real parts
of three complex functions f, g, and h of u = p + iq, meromorphic on the whole
of U. The conformality condition on the map implies, in terms of f, g, and h, that

\[ f'(u)^2 + g'(u)^2 + h'(u)^2 = 0. \]


Weierstrass shrewdly saw that this implies that there are functions G(u) and H(u)
such that

\[ f'(u) = G^2 - H^2, \quad g'(u) = i(G^2 + H^2), \quad h'(u) = 2GH. \]
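A short check, written in modern notation and not taken from Weierstrass's own text, shows why three derivatives of this form satisfy the conformality condition identically, and how G and H can be recovered from f' and g':

```latex
\begin{align*}
f'(u)^2 + g'(u)^2 + h'(u)^2
  &= (G^2 - H^2)^2 + \bigl(i(G^2 + H^2)\bigr)^2 + (2GH)^2 \\
  &= (G^2 - H^2)^2 - (G^2 + H^2)^2 + 4G^2H^2 = 0, \\[4pt]
f'(u) - i\,g'(u) &= 2G^2, \qquad f'(u) + i\,g'(u) = -2H^2 .
\end{align*}
```

Multiplying the two expressions in the last line gives $f'^2 + g'^2 = -4G^2H^2$, which together with the conformality condition yields $h' = \pm 2GH$; the sign can be absorbed into the choice of H.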


He then introduced the complex variable s defined by $s = H(u)/G(u)$, and a technical
argument, which I omit, finally led Weierstrass to the complex function $G^2(u)\,\frac{du}{ds}$,
which he called $S = F(s)$, upon which he based his analysis. He gave explicit power
series expressions for the coordinate functions of a minimal surface in terms of S,
and also for its Gaussian curvature. He pointed out that the principle of analytic
continuation then allowed the coordinate functions to be defined for the entire minimal
surface. So he could finally proclaim that to every single-valued analytic function
there corresponds a surface with mean curvature everywhere zero.

The parameterisation that does this is known today as the Weierstrass–Enneper
equations, and it can be given in various forms.¹¹
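After a little work, the Weierstrass–Enneper representation can be given in the
form

\[ x = c_1 + \operatorname{Re} \int_{\omega_0} (1 - \omega^2) R(\omega)\, d\omega \tag{23.12} \]

\[ y = c_2 + \operatorname{Re} \int_{\omega_0} i(1 + \omega^2) R(\omega)\, d\omega \tag{23.13} \]

\[ z = c_3 + \operatorname{Re} \int_{\omega_0} 2\omega R(\omega)\, d\omega. \tag{23.14} \]

Here, x, y, and z are the real parts of f(u), g(u), and h(u).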


¹¹ See the MIT account, http://ocw.mit.edu/courses/mathematics/18-994-seminar-in-geometry-fall-2004/lecture-notes/chapter18.pdf.

Weierstrass gave no examples; however, for suitable choices of R various well-known
minimal surfaces are obtained. For example, setting $R = 1$ leads to Enneper’s
minimal surface, and $R(\omega) = a/(2\omega^2)$ to the catenoid.
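These two choices of R can be checked numerically. The following is only a sketch: it evaluates the three integrals of Eqs. (23.12)–(23.14) along a straight segment with Simpson's rule, takes the constants $c_1, c_2, c_3$ to be zero, and uses base points 0 (Enneper) and 1 (catenoid). The function names, the base points, and the integration scheme are assumptions made for this illustration and do not come from the text.

```python
import math


def weierstrass_point(R, w_start, w_end, steps=2000):
    """Return (x, y, z) from Eqs. (23.12)-(23.14) with c1 = c2 = c3 = 0.

    The three complex integrals are taken along the straight segment from
    w_start to w_end with the composite Simpson rule (steps must be even),
    and the real parts are returned.
    """
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    h = (w_end - w_start) / steps          # complex step along the segment
    totals = [0j, 0j, 0j]
    for k in range(steps + 1):
        w = w_start + k * h
        r = R(w)
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        totals[0] += weight * (1 - w * w) * r
        totals[1] += weight * 1j * (1 + w * w) * r
        totals[2] += weight * 2 * w * r
    return tuple((t * h / 3).real for t in totals)


def enneper(w):
    """R = 1 with base point 0 (hypothetical helper for this sketch)."""
    return weierstrass_point(lambda s: 1.0, 0j, w)


def catenoid(w, a=1.0):
    """R(w) = a / (2 w^2) with base point 1; the segment must avoid w = 0."""
    return weierstrass_point(lambda s: a / (2 * s * s), 1 + 0j, w)


if __name__ == "__main__":
    w = complex(0.4, -0.3)
    print("Enneper point:", enneper(w))
    x, y, z = catenoid(w, a=2.0)
    # With base point 1 the axis of the catenoid is the line x = a, y = 0,
    # so the distance from that axis should equal a * cosh(z / a).
    print("catenoid radius:", math.hypot(x - 2.0, y), "vs", 2.0 * math.cosh(z / 2.0))
```

With the base point 1 the catenoid comes out translated, its axis being the line x = a, y = 0; a different choice of $c_1$ would move the axis to the z-axis.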

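The alternative expressions in terms of the functions G(u) and H(u) are

\[ x = x_0 + \operatorname{Re} \int \left( G(u)^2 - H(u)^2 \right) du, \tag{23.15} \]

\[ y = y_0 + \operatorname{Re} \int i\left( G(u)^2 + H(u)^2 \right) du, \tag{23.16} \]

\[ z = z_0 + \operatorname{Re} \int 2\, G(u) H(u)\, du. \tag{23.17} \]

The connection between these formulae and Riemann’s is given by $\eta = \frac{H(u)}{G(u)}$.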

Because Weierstrass’s accounts were published and taken up by his students before 
Riemann’s posthumous account appeared, most subsequent authors credited Weier- 
strass with discovering the intimate connection between minimal surfaces and com- 
plex function theory. 

As we saw in Chap. 19, Schwarz was the student of Weierstrass who remained 
most closely associated with the master. This is particularly true of his earliest work, 
as one would naturally suppose. In 1865, Weierstrass gave a seminar at the University 
of Berlin on minimal surfaces. Perhaps as part of the seminar, Schwarz took up and 
solved the problem of finding the minimal surface bounded by a space quadrilateral, 
unaware that this problem had already been solved by Riemann. Schwarz’s mathematical
solution was presented to the Berlin Academy by Kummer, accompanied by a
gypsum model that Schwarz, it seems, had made himself, based on experiments with
glycerine. On the strictly mathematical side, Schwarz’s analysis made considerable
use of the theories of elliptic and hyperelliptic functions, because he chose a particular
hyperelliptic function for substitution into the Enneper–Weierstrass equations. This
gave his surface a natural periodicity: pieces of it could be fitted together to form 
an annular region with two boundaries, and these pieces could in turn be joined up 
to form an infinitely extended surface with infinite topological genus. Altogether a 
remarkable discovery with which to embark on a career in mathematics (Fig. 23.2). 

A deep insight into analytic functions underpinned this work. This was the recog- 
nition that if a straight line lies in a minimal surface then the surface has that line as 
a line of symmetry. In other words, and to give the highly plausible physical moti- 
vation, if two pieces of minimal surface meet along a common line and have the 
same normals there then each is the analytic continuation of the other. This is the 
origin of the Schwarz reflection principle, which Schwarz went on to prove in 1869. 
Interestingly, Schwarz wrote that he had learned this insight from a conversation
with Weierstrass.

Fig. 23.2 A piece of Schwarz’s periodic minimal surface, S. Schwarz, Gesammelte mathematische Abhandlungen, vol. I, facing p. 132


23.5 Simple Solutions of the Plateau Problem 


As noted above, the Plateau problem asks for a minimal surface that spans a given 
curve in space. In general, given a piece of a minimal surface bounded by a curve 
in space, one has no knowledge of the behaviour of the Gauss map on the boundary 
curve. But if the boundary curve contains a straight line then along that line the 
image of the Gauss map can only be an arc of a circle on the image sphere (although 
any part of it may be covered more than once). For this reason, attempts on the 
Plateau problem were confined to polygonal curves in space, because the image of 
the boundary under the Gauss map is made up of arcs of great circles on the sphere. 

The polygon is specified by giving the coordinates of its n vertices. So the problem
is a three-dimensional version of the Schwarz–Christoffel problem, which was under
investigation at the same time.
Accordingly, the solution to the Plateau problem might then be expected to come
in two parts. The minimal surface is realised by a map from the upper half-plane
that maps the real axis to the polygonal boundary curve. In the first part, we take
arbitrary points $z_1, z_2, \ldots, z_n$ on the real axis and look for a holomorphic map with
branch points of the correct order at these points, so that the map will send the upper
half-plane onto a polygon with the required angles at each vertex, but not necessarily
a polygon of the right size and shape. In the second part, we fix the positions of the
points $z_1, z_2, \ldots, z_n$ on the real axis so that the map is onto the given polygon. The
side lengths are given by certain integrals, and the task is to use that information to
determine the right values of those points on the real axis.
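The remark that the side lengths are given by integrals can be made concrete in the simplest planar Schwarz–Christoffel case, a triangle. The sketch below is an illustration added here, not the construction used by Weierstrass or Schwarz: it assumes the standard map $f(z) = \int_0^z t^{\alpha-1}(1-t)^{\beta-1}\,dt$ with pre-vertices 0, 1, and infinity, for which the side-length integrals reduce to Euler Beta functions. The function names are hypothetical.

```python
import math


def beta(a, b):
    """Euler Beta function B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def triangle_side_lengths(alpha, beta_angle):
    """Side lengths of the image of the upper half-plane under
    f(z) = integral_0^z t**(alpha - 1) * (1 - t)**(beta_angle - 1) dt.

    The image is a triangle with interior angles alpha*pi, beta_angle*pi and
    gamma*pi at the images of 0, 1 and infinity, where
    gamma = 1 - alpha - beta_angle.  Each side length is the integral of the
    modulus of the integrand over one of the intervals (0, 1), (1, inf),
    (-inf, 0), and each of these is a Beta function.
    """
    gamma_angle = 1.0 - alpha - beta_angle
    if min(alpha, beta_angle, gamma_angle) <= 0:
        raise ValueError("the three angles must be positive and sum to pi")
    return {
        "f(0)f(1)": beta(alpha, beta_angle),
        "f(1)f(inf)": beta(beta_angle, gamma_angle),
        "f(inf)f(0)": beta(alpha, gamma_angle),
    }


if __name__ == "__main__":
    alpha, beta_angle = 0.5, 0.3
    sides = triangle_side_lengths(alpha, beta_angle)
    gamma_angle = 1.0 - alpha - beta_angle
    # Law of sines: each side divided by the sine of the opposite angle.
    print(sides["f(0)f(1)"] / math.sin(math.pi * gamma_angle))
    print(sides["f(1)f(inf)"] / math.sin(math.pi * alpha))
    print(sides["f(inf)f(0)"] / math.sin(math.pi * beta_angle))
```

For a triangle the three pre-vertices can always be moved to 0, 1, and infinity, so the angles fix the shape and only the scale remains. With four or more vertices the positions of the remaining pre-vertices have to be solved for from the side-length integrals, which is the second part of the problem described above.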

In a note (the “Fortsetzung”) added to his first paper on minimal surfaces, Weierstrass
approached the problem through the theory of linear differential equations,
and tried to define the equation that the functions G(u) and H(u) must satisfy.¹²
It is trivial that two functions satisfy a second-order linear differential equation; the
important question is what can be said about the coefficients. Weierstrass claimed that
the equation satisfied by the functions G and H has rational coefficients that depend
on the lengths and directions of the boundary segments. However, “In general”, he
said,

the determination of the constants in this differential equation, such as the constants of
integration, depend on the solution of transcendental equations; but it is not difficult to find
special cases where one can find complete expressions for G(u) and H(u) in terms of known
functions.


Weierstrass concluded by promising a full report at a later date, but this was never 
given. All we have is the “Bestimmung”, which is Schwarz’s account written in 
the 1890s of what Weierstrass had in mind when giving that two-page report to the 
Academy of Sciences in December 1866. 

As Schwarz reconstructed it, Weierstrass’s approach had been to consider what
had to happen at each vertex of the polygon. The polygon has n angles
$\alpha_1, \alpha_2, \ldots, \alpha_n$ at the corresponding vertices. Moreover, each angle defines a plane whose
orientation is captured by the normal to the angle at the vertex. Schwarz claimed
that by working in this way Weierstrass had found some important facts about this
differential equation. In particular, he had deduced that at each branch point the
equation, regarded as an equation for a function y(t), has a simple pole for the
coefficient of $\frac{dy}{dt}$ and a double pole for the y-coefficient. These are the only singular
points, and the point at infinity must be what is called an inessential singular point
for the differential equation.¹³

This argument fails to determine the required differential equation completely, and
as a result it seems to have been abandoned. First, as Weierstrass had indicated, it is
necessary to show how the pre-images of the vertices on the boundary are determined,
but that problem was not discussed. Second, it says nothing about the important role
played by the branch point of the Gauss map in the interior of the minimal surface.
But these are very difficult problems. Indeed, as late as 1914 Darboux remarked that
“Thus far, mathematical analysis has not been able to envisage any general method
which would permit us to begin the study of this beautiful question”.

¹² Lazarus Fuchs was developing the theory of these equations under his influence at the time.

¹³ An inessential singular point is one where the solutions of the differential equation remain
holomorphic in a neighbourhood of the point even though at least one of the coefficients of the
differential equation is singular.


23.6 Exercises 


Questions 


1. Find examples on the web of some of the best-known minimal surfaces: the 
helicoid, the catenoid, and Enneper’s surface. 

2. On the (correct) assumption that any closed curve spans a minimal surface, find
examples of minimal surfaces that are topologically Möbius bands.


Chapter 24
Partial Differential Equations and Mechanics


24.1 Introduction 


Several issues arise here. Does mechanics start with equations of motion, or with 
principles like the principle of least action? Is the Lagrangian formulation, although 
completely general, always the best, or can there be others? Is there a worthwhile 
parallel between mechanics and geometry? 

This is a tough chapter, so let me spell out the route through it, and say what is
essential. First, I present a clean, modern version of the key mathematical ideas.¹
Given a mechanical problem expressed in terms of a Lagrangian (see Sect. 7.6) in
generalised coordinates $q_j$ and $\dot{q}_j$, it can be represented in terms of more symmetric
coordinates $q_j$ and $p_j$ and a new function called the Hamiltonian, which is closely
related to the Lagrangian (Eq. (24.1)). Important here is what are called Hamilton’s
equations (Eq. (24.2)).

Hamilton had the brilliant idea of looking at what happens to the time evolution of
a Hamiltonian system. Now the upper end point of the Hamiltonian (or Lagrangian)
is allowed to be a function of time, and the paths of the $q_j$ continually obey their
Euler–Lagrange equations. This gave him a function W of the upper end points
and the time, and it satisfies a particular first-order partial differential equation (see
Eq. (24.3)) that became known as the Hamilton–Jacobi equation after Jacobi saw
more deeply into it.

If we suppose that the values $q_1, q_2, \ldots, q_n$ define a point $(q_1, q_2, \ldots, q_n)$ in a
space we can call Q, then it turns out that as time goes by in a Hamiltonian system these
points define a moving hypersurface.

Moreover, solutions of the Hamilton—Jacobi equation satisfy one half of Hamil- 
ton’s equations, thus making a connection between a first-order partial differential 
equation and a family of first-order ordinary differential equations. Indeed, the q’s 
as they evolve describe the characteristic curves of the partial differential equation. 


¹ This one follows www.damtp.cam.ac.uk/user/tong/dynamics.htm, which is David Tong’s
Cambridge notes.

This is what Cauchy appreciated (see Chap. 12), and it opened up, surprisingly, a
way of solving some systems of ordinary differential equations by solving a partial
differential equation.

This being a history course, after presenting these ideas I document that Hamilton
did possess them in his own way, that Jacobi re-interpreted them (and that he
had scathing views about much of what had been said about variational principles
before him), and that the connection between ordinary and partial differential
equations just mentioned was soon appreciated.


24.2 Hamiltonian Dynamics 


Let us suppose we have a dynamical problem expressed in terms of a Lagrangian. I
shall write $L(q, \dot{q}, t)$ for the Lagrangian, where $q$ stands for an n-tuple in variables
$q_1, q_2, \ldots, q_n$ and $\dot{q}$ for the corresponding n-tuple $\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n$; we can get a long
way by thinking of n = 1. I shall write $\Delta$ rather than $\delta$ for the variational symbol, to
avoid confusing $\delta$ and $S$, a symbol to be introduced shortly. I shall write $t^0$ for the
initial time and $t^1$ for the final time.

As we shall see, William Rowan Hamilton defined a function, which he called
the principal function, as the time integral of the Lagrangian:

\[ S = \int_{t^0}^{t^1} L(q, \dot{q}, t)\, dt. \]

A standard variational argument shows that

\[ \Delta S = \int_{t^0}^{t^1} \Delta L(q, \dot{q}, t)\, dt = \int_{t^0}^{t^1} \left( \frac{\partial L}{\partial q}\, \Delta q + \frac{\partial L}{\partial \dot{q}}\, \Delta \dot{q} \right) dt. \]

Note that this is a system of n equations, one for each $q_j$, j = 1, ..., n.
We integrate the second term in the integral by parts and obtain

\[ \Delta S = \int_{t^0}^{t^1} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \right) \Delta q\, dt + \left[ \frac{\partial L}{\partial \dot{q}}\, \Delta q \right]_{t^0}^{t^1}. \]
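If the end points are fixed then the term in square brackets vanishes. So the variation
of the integral vanishes if the integrand vanishes for all $\Delta q$, and the result is the
Euler–Lagrange equations:

\[ \frac{\partial L}{\partial q_j} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} = 0, \quad j = 1, \ldots, n. \]

Following Hamilton, we define

\[ p_j = \frac{\partial L}{\partial \dot{q}_j}, \]

so

\[ \frac{dp_j}{dt} = \dot{p}_j = \frac{\partial L}{\partial q_j}. \]

We now define the Hamiltonian

\[ H = H(q, p, t) = \sum_j p_j \dot{q}_j - L. \tag{24.1} \]

Therefore

\[ \dot{p}_j = \frac{\partial L}{\partial q_j} = -\frac{\partial H}{\partial q_j}, \]

and we find

\[ \dot{q}_j - \frac{\partial L}{\partial p_j} = \frac{\partial H}{\partial p_j}, \]

but L is not a function of the $p_j$ and so $\frac{\partial L}{\partial p_j} = 0$ and so $\dot{q}_j = \frac{\partial H}{\partial p_j}$.
These equations

\[ \dot{p}_j = -\frac{\partial H}{\partial q_j} \quad \text{and} \quad \dot{q}_j = \frac{\partial H}{\partial p_j} \tag{24.2} \]

for j = 1, ..., n are called Hamilton’s equations or the canonical equations. The
solution to these equations resolves the dynamical problem at stake.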
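As a concrete instance of this construction, consider a single particle of mass m moving on a line in a potential V(q). This example is an illustration added here; the symbols m and V are assumptions and do not appear in the derivation above.

```latex
\begin{align*}
L &= \tfrac12 m\dot q^{2} - V(q), &
p &= \frac{\partial L}{\partial \dot q} = m\dot q, \\
H &= p\dot q - L = \frac{p^{2}}{2m} + V(q), &
\dot q &= \frac{\partial H}{\partial p} = \frac{p}{m}, \quad
\dot p = -\frac{\partial H}{\partial q} = -V'(q).
\end{align*}
```

Eliminating p gives $m\ddot q = -V'(q)$, which is Newton's equation for this system, and H is the sum of the kinetic and potential energies.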
They have a simple, if perhaps surprising, corollary that turns out to be useful. We
calculate

\[ \frac{dH}{dt} = \sum_j \left( \frac{\partial H}{\partial q_j} \frac{dq_j}{dt} + \frac{\partial H}{\partial p_j} \frac{dp_j}{dt} \right) + \frac{\partial H}{\partial t}. \]

But by the canonical equations, the RHS equals $\frac{\partial H}{\partial t}$ together with a sum of terms
of the form

\[ -\dot{p}_j \frac{dq_j}{dt} + \dot{q}_j \frac{dp_j}{dt}, \]

so the summations cancel, and it follows that

\[ \frac{dH}{dt} = \frac{\partial H}{\partial t}. \]

In other words, if the Hamiltonian H does not involve t explicitly, so that
$\frac{\partial H}{\partial t} = 0$, then H is conserved: $\frac{dH}{dt} = 0$. In conservative systems, the
Hamiltonian can be treated as the energy.
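The conservation statement can be checked numerically. The sketch below assumes a harmonic oscillator Hamiltonian $H = p^2/2m + kq^2/2$, chosen only for illustration, and integrates Hamilton's equations with the leapfrog scheme, a standard numerical method that is not part of the historical account. The parameter names m, k, dt and the function names are hypothetical. The computed H is constant only up to a small discretisation error.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """H(q, p) = p^2 / (2m) + k q^2 / 2 for a harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k*q.

    Each step is a half step in p, a full step in q, and a second half
    step in p (the leapfrog, or velocity Verlet, scheme).
    """
    for _ in range(steps):
        p -= 0.5 * dt * k * q      # half step in p using dp/dt = -dH/dq
        q += dt * p / m            # full step in q using dq/dt = dH/dp
        p -= 0.5 * dt * k * q      # second half step in p
    return q, p


if __name__ == "__main__":
    q0, p0 = 1.0, 0.0
    q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=10_000)
    print("H at start:", hamiltonian(q0, p0))
    print("H at end  :", hamiltonian(q1, p1))
    print("q(10) numerical vs exact:", q1, math.cos(10.0))
```

For m = k = 1 and the initial data above the exact solution is q = cos t, p = -sin t, which gives an independent check on the integration.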

Now we turn to a different approach, one that often turns out to have complementary
virtues. Hamilton had the ingenious idea of investigating what happened to
the path given by the Euler–Lagrange equations as the upper end point varied and as
the time taken varied. He may well have done this because he was interested in the
passage of light through crystals. So in what follows the $q_j(t)$ that define the path
satisfy the Euler–Lagrange equations.

He now regarded his principal function

\[ S = \int_{t^0}^{t^1} L(q, \dot{q}, t)\, dt \]

as a function of its upper end point $t^1 = T$, where the value of q varies. The initial values
of the coordinates $q^0$ are fixed. We have $\frac{dS}{dT} = L$. Suppose that we fix a value of
$t^1 = T$. Then we can define a function

\[ W = W(q^0, q^1, T) = S(q(T)). \]

The distinction between these two functions is that S is defined on any path but W
is a function of the end points and the time the system has been evolving.
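A variational argument now gives that

\[ \Delta S = \int_{t^0}^{T} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \right) \Delta q\, dt + \left[ \frac{\partial L}{\partial \dot{q}}\, \Delta q \right]_{t^0}^{T}, \]

but now we deduce that

\[ \Delta S = \frac{\partial L}{\partial \dot{q}}\, \Delta q^1, \]

where the term on the right is evaluated at t = T. But

\[ \Delta S = \frac{\partial W}{\partial q^1}\, \Delta q^1, \]

so

\[ \frac{\partial W}{\partial q^1} = p^1. \]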


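Now we allow T to vary, and calculate $\frac{\partial W}{\partial T}$. We have

\[ \frac{dW}{dT} = \frac{\partial W}{\partial T} + \frac{\partial W}{\partial q^1} \frac{dq^1}{dT} = \frac{\partial W}{\partial T} + p^1 \dot{q}^1. \]

We have $\frac{dS}{dT} = L$, so, using Eq. (24.1),

\[ \frac{\partial W}{\partial T} = -H(q^1, p^1, T). \]

But the upper end points and the time are the only variables involved, so we can
drop the labels and write

\[ W = W(q, t), \quad \frac{\partial W}{\partial q} = p, \quad \frac{\partial W}{\partial t} = -H(q, p, t). \]

The last equation can be written in this form:

\[ \frac{\partial W}{\partial t} = -H\left( q, \frac{\partial W}{\partial q}, t \right). \tag{24.3} \]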
ot oq 


In this form, it is called the Hamilton-Jacobi equation. It has the property that W 
does not appear explicitly, so if W is a solution then so is W +c, where c is an 
arbitrary constant. 
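A numerical check of the Hamilton–Jacobi equation is possible whenever W is known in closed form. The sketch below assumes the harmonic oscillator $H = p^2/2m + m\omega^2 q^2/2$, for which the principal function between $(q^0, 0)$ and $(q, T)$ has the standard closed form used in the code (valid while $\sin \omega T \neq 0$). The example, the function names, and the finite-difference step are assumptions made for illustration and are not taken from Hamilton or Jacobi.

```python
import math


def principal_function(q0, q, T, m=1.0, omega=1.0):
    """W(q0, q, T) for the harmonic oscillator, valid when sin(omega*T) != 0."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return m * omega * ((q * q + q0 * q0) * c - 2.0 * q * q0) / (2.0 * s)


def hamiltonian(q, p, m=1.0, omega=1.0):
    """H(q, p) = p^2 / (2m) + m omega^2 q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * m * omega * omega * q * q


def hamilton_jacobi_residual(q0, q, T, m=1.0, omega=1.0, h=1e-5):
    """Return dW/dT + H(q, dW/dq), using central finite differences.

    The Hamilton-Jacobi equation says that this quantity vanishes.
    """
    def W(q_, T_):
        return principal_function(q0, q_, T_, m, omega)

    dW_dq = (W(q + h, T) - W(q - h, T)) / (2.0 * h)   # plays the role of p
    dW_dT = (W(q, T + h) - W(q, T - h)) / (2.0 * h)
    return dW_dT + hamiltonian(q, dW_dq, m, omega)


if __name__ == "__main__":
    print(hamilton_jacobi_residual(0.3, 1.1, 0.7))
    print(hamilton_jacobi_residual(-0.4, 0.9, 1.2, m=2.0, omega=1.5))
```

The printed residuals are zero up to the error of the finite-difference approximation, which is the numerical counterpart of Eq. (24.3).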

It remains to show that the solutions of this equation solve half of Hamilton’s
equations, i.e. that

\[ \dot{p} = -\frac{\partial H}{\partial q}, \]

on the assumption that we have $\frac{\partial H}{\partial p} = \dot{q}$; these are taken as part of the initial

conditions. (At each point when t¢ a there is a unique curve that obeys the Euler— 
Lagrange equations and the condition just assumed. As we shall see in Chap. 25, this
is closely analogous to the existence of a geodesic on a manifold through a given
point in a given direction, and this in turn derives from the fact that the geodesic
equation is a second-order equation.)
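We write

\[ p = \frac{\partial W}{\partial q}, \]

so

\[ \frac{dp}{dt} = \frac{d}{dt} \frac{\partial W}{\partial q} = \frac{\partial^2 W}{\partial q\, \partial q}\, \dot{q} + \frac{\partial^2 W}{\partial q\, \partial t}. \]

The second term on the RHS is $-\frac{\partial}{\partial q} H(q, p, t)$, and differentiation here of p with
respect to q is required because $p = \frac{\partial W}{\partial q}$. So we get

\[ \dot{p} = \frac{\partial^2 W}{\partial q\, \partial q}\, \dot{q} - \frac{\partial H}{\partial q} - \frac{\partial H}{\partial p} \frac{\partial^2 W}{\partial q\, \partial q}. \]

So we finally obtain

\[ \dot{p} = -\frac{\partial H}{\partial q}, \]

as required.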

So now, if we can solve the Hamilton-Jacobi equation we can solve the canonical 
equations—and if we can solve the canonical equations we can solve the Hamilton— 
Jacobi equation. 

If we now suppose further that H does not depend explicitly on t then we may
write Eq. (24.3) in the form

\[ H\left( q, \frac{\partial S}{\partial q} \right) + \frac{\partial S}{\partial t} = 0, \]

for which the general solution will be of the form

\[ S(q, \alpha, t) = W(q, \alpha) - \alpha_0 t. \]

With a little more work (but you might think there’s been enough) and on the
further assumption that H is the total energy, and therefore a constant of the motion,
it can be shown that we can therefore think of the surfaces S = const. as wave fronts
moving in Q, and their progress analysed in some detail.

We shall investigate the solutions of the Hamilton—Jacobi equation in Sect. 24.4, 
but we now turn to look at Hamilton’s original paper and at Jacobi’s comments on it 
and the theory out of which it grew. His comments are enjoyably fierce. 


24.3 Hamilton’s and Jacobi’s Theories of Dynamics


I first note an annoying sign convention: what we write as U, the potential energy of
a system of particles, was denoted −U in the nineteenth century. To minimise confusion,
I shall sometimes write U = −U to denote potential energy with the old sign
convention. So where we write L = T − U for the Lagrangian, these mathematicians
wrote L = T + U; in each case, T denotes the kinetic energy of the system.

Hamilton took up the topic of dynamics, the motion of a system of particles under
their mutual attraction, in 1834, when he wrote his essay “On a general method
in dynamics”, and he wrote a “Second essay” on the subject a year later. We shall
concentrate on it.²

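In this essay, he worked from the start in generalised coordinates $\eta_j$, and introduced
the new coordinates $\varpi_j = \frac{\partial T}{\partial \dot{\eta}_j}$. Here T is expressed with respect to the $\eta_j$ and
$\dot{\eta}_j$; as a function of $\eta_j$ and $\varpi_j$ the same function is denoted F. Hamilton wrote down
the expressions for varying T and varying F and deduced that

\[ \frac{\partial (F - U)}{\partial \varpi_j} = \frac{\partial F}{\partial \varpi_j} = \dot{\eta}_j, \quad \frac{\partial F}{\partial \eta_j} = -\frac{\partial T}{\partial \eta_j}, \]

because the potential function U does not depend on $\varpi_j$. So Lagrange’s equations
take the form

\[ \frac{d\varpi_j}{dt} = \frac{\partial (U - F)}{\partial \eta_j}. \]

Hamilton set H = F − U and obtained the equations

\[ \frac{d\eta_j}{dt} = \frac{\partial H}{\partial \varpi_j}, \quad \frac{d\varpi_j}{dt} = -\frac{\partial H}{\partial \eta_j}. \]

These later became called the canonical equations. This is the first appearance of the
canonical equations, although important steps in that direction had been taken by
Poisson in a paper Hamilton cited; Hamilton’s improvement was the introduction of
the $\varpi_j$.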

Hamilton then reintroduced the principal function S as

\[ S = \int_0^t (T + U)\, dt, \]

which says that the principal function S is the time integral of the Lagrangian. A
calculation of the variation of S enabled Hamilton to show that

\[ \frac{\partial S}{\partial t} + H = 0 \]

(or rather, a set of equations equivalent to that result).

At this point, Fraser and Nakane make several valuable observations. They note
([108], 184) that the new derivation did not assume the conservation of mechanical energy,
a fact that Hamilton did not notice; he repeatedly stated, indeed, that his method
was restricted to cases where that law applied.

² This account closely follows [108] which can profitably be consulted for many interesting insights.
They look in some detail at the first essay.

Moreover, as it happens, the variational conditions often imposed by Lagrange are 
the vanishing of the “action” and fixed end points for the curves under consideration. 
This can have the effect that the only curve considered is the minimiser itself—there 
are no other candidates! Hamilton allowed the end points to vary and so produced a 
genuine family of candidate curves. 

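Finally, they remark that Hamilton also noticed that if the variation of S vanishes
then

\[ \delta S = \sum_j \left[ \frac{\partial T}{\partial \dot{\eta}_j}\, \delta \eta_j \right]_0^t - \int_0^t \sum_j \left( \frac{d}{dt} \frac{\partial T}{\partial \dot{\eta}_j} - \frac{\partial T}{\partial \eta_j} - \frac{\partial U}{\partial \eta_j} \right) \delta \eta_j\, dt. \]

If the end points are fixed then the first sum becomes zero and the other term is
simply Lagrange’s equations.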


24.3.1 Jacobi 


Carl Gustav Jacob Jacobi must have read Hamilton’s papers in 1836, because he
wrote to his brother Moritz about them in September 1836 to say that they had led
him to make a deep study of dynamics.³ Although Jacobi had a genuine interest
in mechanics, and had recently made an important breakthrough in the three-body
problem, his reworking of Hamilton’s ideas was much more that of an analyst.

He established carefully that the principal function S is a function of the variables
$(x_j, y_j, z_j)$, the initial positions $(a_j, b_j, c_j)$ and t. Then he wrote down the expression
for its variation, and deduced that

an equation he called Hamilton’s equation.

A feature of Jacobi’s derivation was that it made clear that the forces could 
depend explicitly on the time. In principle, this extends the study of dynamics to 
non-conservative systems in which energy is lost, but Jacobi lost interest in this in 
the 1840s, and it was for his successors to appreciate this advance on the work of 
Hamilton. 

Less clearly, and we cannot discuss this point fully here, Jacobi’s method is not
wholly variational because he did not discuss the variation of the end points. This
is not a problem for Jacobi’s presentation, because the derivation of “Hamilton’s
equation” is entirely a piece of partial differential equation theory.

³ See Koenigsberger ([162], 198), quoted in Hawkins ([139], 205) and Pulte ([230]).

⁴ The same equation appears in [135]. Jacobi did not call S the principal function, however.

Jacobi had various criticisms of Hamilton’s work that we cannot go into here and 
for which the reader is referred to Fraser and Nakane [108]. It says a lot about the 
growing place for mathematics in Germany (and, of course, France) and its uncertain 
place in Great Britain even with a mathematician of Hamilton’s talents, that there is 
much justice in Nakane and Fraser’s concluding remarks (p. 220): 


Hamilton was the great creator, and it is unimaginable that Jacobi could have reached the 
level of remarkable abstract insight that he did without a foundation already in place. Jacobi 
nevertheless had a better knowledge of contemporary analysis and a better sense for how 
the new ideas should be developed at an appropriate theoretical level within the calculus of 
variations, mathematical dynamics and differential equations. He possessed as well a talent 
for making the new ideas accessible to receptive mathematicians. Although he died some 
fifteen years earlier than Hamilton, his posthumous Vorlesungen would become the most 
influential work in the history of mathematical dynamics since Lagrange. 


With this in mind, it seems worthwhile to give some quotations from Jacobi’s
posthumously published lectures on dynamics, starting with an extract from the
sixth lecture.⁵


We now come to a new principle which, unlike earlier ones, does not give an integral. This 
is the “principle of least action” incorrectly called that of least work. Its importance lies first 
in the form in which it presents the differential equations of motion, and second in that it 
gives a function which is minimized when these differential equations are satisfied. Indeed, 
such a minimum does exist in all examples, but the reason for this is unknown. Whereas 
the interest of this principle consists precisely in the fact that one can generally construct a 
minimum, formerly too much importance was attributed to the existence of such a minimum. 
An example of the principle under consideration comes from Euler’s “de motu projectorum’’. 
After ...proving the principle for attraction to fixed centers, he did not succeed in extending 
it to the n-body problem, for which he did not know the principle of kinetic energy; he 
contented himself with stating that the computations were very lengthy. But Euler said that 
the principle of least work had to be valid also here, since the fundamental results of a sound 
metaphysics revealed that forces in nature always do the least work. 


However, neither a sound nor any metaphysics shows this, and indeed Euler was led to this 
expression only through misunderstanding of the name “least work.” Maupertuis meant that 
nature achieved her work with the least expenditure of force, and this is the true meaning of 
the “principle of least action.” 


In my opinion, this principle is presented incomprehensibly in all textbooks, even in the best,
those by Poisson, Lagrange, and Laplace. Namely, it is stated that the integral $\int \sum m_j v_j\, ds_j$
(where $v_j = \frac{ds_j}{dt}$ is the velocity of the point $m_j$) is minimized, when taken from one position
of the system to another. To be sure, this is only stated to be valid for conservative systems,
but it is forgotten that one must eliminate time from the above integral and reduce everything
to space elements. Moreover, this integral must be understood to be a minimum for given
initial and final configurations and all possible paths joining them.


⁵ They were published in 1866. Compare the extract in the Birkhoff Source Book, 374-379, which
gives some of the technical material as well.

In his eighth lecture, Jacobi obtained the Lagrangian form of the equations of
dynamics, and wrote

In place of the Principle of Least Work, one can substitute another principle, which also
consists in the vanishing of a first variation, and which can be derived from the differential 
equations of motion even more simply than from the Principle of Least Work. This variational 
principle seems to have been unnoticed previously, because — in contrast to the Principle of 
Least Work — it does not correspond to a minimum principle. Hamilton was the first to use 
this principle as a point of departure. We will use it to set down the equations of motion in 
the form given by Lagrange in his Mécanique analytique. 


Now let $X_j$, $Y_j$, $Z_j$ be the partial derivatives of a function U, and let T be half the living
force [kinetic energy], that is,

\[ T = \frac{1}{2} \sum_j m_j \left[ \left( \frac{dx_j}{dt} \right)^2 + \left( \frac{dy_j}{dt} \right)^2 + \left( \frac{dz_j}{dt} \right)^2 \right]; \]

then the new principle is

\[ \delta \int (T + U)\, dt = 0. \tag{1} \]

This principle is equivalent to the Principle of Least Work, but more general in that t may
enter U explicitly, which is excluded in the previous Principle [of Least Work], because in
the latter time must be eliminated using the theory of living force [conservation of energy],
and this holds only when t does not enter U explicitly. We will use equation (1) to derive the
differential equations of motion from a first-order partial differential equation. As Hamilton
showed, one can decompose the variation in (1), using integration by parts, into two parts,
of which one is outside and the other inside the integral sign, and both must vanish. In this
way, the integrand, which equals zero, gives the differential equations of the problem, and
the expression outside the integral sign gives its integral equations.


The complete statement of the new principle reads as follows: Let the configurations of the
system be given at a specified initial time $t_0$ and final time $t_1$. Then the actual intervening
motion is determined by the equation $\delta \int (T + U)\, dt = 0$ of (1).

Here the integral is taken from $t_0$ to $t_1$; U is the force function and can contain the time
explicitly, and T is half the living force.
Jacobi then derived the equations

\[ \frac{dp_j}{dt} = \frac{\partial (T + U)}{\partial q_j}, \quad p_j = \frac{\partial T}{\partial \dot{q}_j}. \]

In his ninth lecture, Jacobi then obtained the Hamiltonian form of the equations
of dynamics. He first obtained the equations in a form Poisson had presented them
in 1820:

\[ \frac{dp_j}{dt} = \frac{\partial (T + U)}{\partial q_j} = \frac{\partial T}{\partial q_j} + \frac{\partial U}{\partial q_j}, \]

in which $\frac{\partial U}{\partial q_j}$ depends only on the $q_j$ and $\frac{\partial T}{\partial q_j}$ is a homogeneous quadratic function of
the $\dot{q}_j$ and therefore of the $p_j$. This yields the equations

\[ \frac{dp_j}{dt} = P_j, \quad \frac{dq_j}{dt} = Q_j, \]

about which he remarked

This is Poisson’s form of the equations of motion, where the $Q_j$ and $P_j$ contain no variables
other than the ps and the qs. This system of 2k equations has the following noteworthy
properties:

Of these, Poisson (loc. cit.) writes down those of the first group, while the rest can be written
down directly from his results. Equations (2) show that the quantities $Q_j$ and $P_j$ are to be
recognized as the partial derivatives of a single function of the $p_j$ and $-q_j$. This observation,
which comes directly from equations (2), Poisson does not make; still less does he try to
use this function. It was Hamilton who first expressed it, and through the introduction of
his characteristic function the whole reformulation is made extraordinarily easier. One can
reach the same conclusion almost by one’s self, if one derives the kinetic energy theorem
from the second Lagrange form of the differential equations given in formula (9) [of the
Eighth Lecture].


24.4 First-Order Partial Differential Equation Theory 


None of this would be much use unless the actual equations could be solved, and many 
authors comment explicitly on the utility of the Hamilton-Jacobi equation. After 
all, one’s experience is that partial differential equations are usually hard to solve, 
whereas ordinary differential equations are usually easier. So, given a first-order 
partial differential equation, one looks for the system of characteristic equations, 
which are ordinary differential equations, and hopes to solve them. But, as Hilbert 
and Courant commented (Vol. 2, 107):


Hamilton and Jacobi achieved a major success by recognizing that this relationship may be 
reversed. To be sure, the integration of a partial differential equation is usually considered 
as a problem more difficult than that of a system of ordinary differential equations. In 
mathematical physics one is often led, however, to a system of ordinary differential equations 
in canonical form. These equations may be difficult to integrate by elementary methods, while 
the corresponding partial differential equation is manageable; in particular, it may happen 
that a complete integral is easily obtained, e.g., with the help of the separation of variables 
(cf. Ch. I §3). Knowing the complete integral, one can then solve the corresponding system of 
characteristic ordinary differential equations by processes of differentiation and elimination. 
This fact, which is contained in the earlier results of §4 and §8, can be formulated in a 
particularly simple way for the case of canonical differential equations and can be verified 
analytically, independently of the motivation [...]. 
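
The reversal described in this passage can be tried out on the simplest example. The sketch below is ours, not Courant and Hilbert's: it assumes the harmonic oscillator Hamiltonian $H = (p^2 + q^2)/2$, for which separation of variables gives the complete integral $S = W(q, E) - Et$ with $dW/dq = \sqrt{2E - q^2}$. Differentiating with respect to the constant $E$ and setting the result equal to a second constant $\beta$ yields the motion, which is then compared with a direct numerical integration of the canonical equations. The function names and the choice of a Runge-Kutta integrator are illustrative assumptions.

```python
import math


def hamilton_rhs(q, p):
    """Canonical equations for H = (p**2 + q**2) / 2: dq/dt = H_p, dp/dt = -H_q."""
    return p, -q


def integrate_rk4(q, p, t_end, steps=1000):
    """Integrate the canonical equations directly with classical Runge-Kutta."""
    h = t_end / steps
    for _ in range(steps):
        k1q, k1p = hamilton_rhs(q, p)
        k2q, k2p = hamilton_rhs(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = hamilton_rhs(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = hamilton_rhs(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p


def from_complete_integral(energy, beta, t):
    """Motion obtained from S = W(q, E) - E t by differentiation and elimination.

    dS/dE = beta gives arcsin(q / sqrt(2E)) - t = beta, and p = dS/dq
    (valid as written while cos(t + beta) > 0, i.e. before a turning point).
    """
    amplitude = math.sqrt(2 * energy)
    return amplitude * math.sin(t + beta), amplitude * math.cos(t + beta)


# Start both descriptions from the same state and compare them at t = 1.
q0, p0 = from_complete_integral(0.5, 0.3, 0.0)
q_ode, p_ode = integrate_rk4(q0, p0, 1.0)
q_hj, p_hj = from_complete_integral(0.5, 0.3, 1.0)
```

The two routes should agree to within the accuracy of the integrator, which is the content of the remark: once a complete integral is known, the characteristic system is solved without integrating it.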


Now, the Hamilton-Jacobi equation is a first-order partial differential equation
in $n$ variables $q_1, \dots, q_n$ and we expect to find a system of $n$ first-order ordinary
differential equations for it whose solutions are the characteristic curves. In fact, these equations
are precisely the equations $\dot q_i = H_{p_i}$ that we took on trust earlier. So we see a very
strong connection between the theory of first-order partial differential equations and
the Hamilton-Jacobi theory of dynamics. Curiously, it is not clear how much of
Cauchy’s theory was known to either of these men at the time.

A comment of the historian Tom Hawkins provides a helpful conclusion to this
difficult mathematics®: 


For Hamilton the importance of the equivalence lay in the direction of replacing the equations 
of motion by the two partial differential equations so that thereby the difficulty of determining 
the motion of a system of masses “is at least transferred from the integration of many
equations of one class to the integration of two of another” [1834]. Jacobi realized that, at 
least in terms of the integration theory of first-order partial differential equations, which 
Hamilton does not appear to have had in mind, “reduction” [...] is hardly an advance. As 
Jacobi explained in a letter to the secretary of the mathematics and physics section of the 
Berlin Academy: “Little would seem to be gained by this reduction to a partial differential 
equation since according to Pfaff’s method ...— and for more than three variables till now 
nothing further was known about the integration of partial differential equations of the 
first order — the integration of the one partial differential equation to which the dynamical 
problem is reduced is much more difficult than integration of the directly given system 
of ordinary differential equations of motion.” He went on to explain, however, that “if 
Hamilton’s investigations are extended to all first-order partial differential equations, as can 
be done without difficulty, it is on the other hand a significant discovery in the theory of 
first-order partial differential equations that they can always be reduced to a single system 
of ordinary differential equations, which previously according to the Pfaffian method was 
insufficient” (1837a: 50-51). 


Here Jacobi was referring to his discovery that the problem of determining a complete 
solution to the general first-order partial differential equation

reduces to the complete integration of a single system of ordinary differential equations [...]
of order $2n - 1$, which is in fact the first system that arises in the application of Pfaff’s
method (1837b: 101-102).


24.5 Exercises 


Questions 


1. An abundance of daily experience suggests that Euclidean geometry is true, and 
adequately axiomatised. What would have to be done—a question Jacobi asked 
when giving his courses on mechanics—to establish an axiomatic account of 
everyday mechanics? 


®See Hawkins ([139], 206). 


Chapter 25
Geometrical Interpretations
of Mechanics


25.1 Introduction 


One of the most important developments in the application of the calculus of varia- 
tions was the exploration of the close analogy between Hamiltonian dynamics and 
Gaussian differential geometry, and it is to this that we now turn. These connections 
have been carefully explored by the historian Jesper Lützen in his [193], and we
follow his account here.¹ The chain of ideas is as follows:


• Gauss introduced the idea of geodesics on a surface and coordinate systems largely
made up of geodesics (compare latitude and longitude on a sphere).

• Liouville mimicked these arguments in the context of mechanics but without
imposing a geometrical interpretation on mechanics.

• Lipschitz interpreted a mechanical trajectory as a geodesic on a surface (or higher
dimensional analogue) and thought of mechanics in geometric terms.

• Darboux wrote all this up carefully and lucidly: mechanics can be studied geometrically.


25.2 Gaussian Curvature 


In his Disquisitiones generales circa superficies curvas (or, General investigations of curved
surfaces) [116], which created the subject of intrinsic differential geometry, Gauss
introduced the idea of a surface as either the image of a map from a patch of $\mathbb{R}^2$ to
$\mathbb{R}^3$ or a domain with coordinates $(p, q)$ and a metric

\[ ds^2 = E\,dp^2 + 2F\,dp\,dq + G\,dq^2, \]

where $E$, $F$, and $G$ are functions of $p$ and $q$.


¹See also Lützen ([192], Chaps. XVI, XVII).

He then imposed a coherent logic on what seemed at times to be a sprawl of
formulae related by arbitrary changes of variable. In particular, he defined a concept 
of curvature of a surface and showed that it was intrinsic to the surface. That is, it 
could be defined entirely without reference to any ambient space in which the surface 
lived (such as, say, a normal to the surface implies). As Gauss puts it, if one surface 
can be mapped isometrically onto another then the values of the curvature agree at 
corresponding points. This was an entirely novel and unexpected idea (Gauss called 
the corresponding theorem the exceptional theorem, Theorema egregium), and it 
gradually transformed the subject of differential geometry. 

Gauss defined his measure of curvature by means of the Gauss map (see 
Sect. 23.4). He defined the (Gaussian) curvature at the point P as the limit 


\[ \lim_{S \to P} \frac{\text{area of } S'}{\text{area of } S}, \]

where $S$ is a region about the point $P$ and $S'$ is its image under the Gauss map. If the
surface is a plane then all the normals point in the same direction and the Gauss
map sends the entire plane to a point: the plane has Gaussian curvature zero. The
Gauss map sends a circular cylinder to a curve (a great circle of the unit sphere), which
has zero area, so the curvature of a cylinder is also
zero. The Gaussian curvature of a sphere of radius $R$ is easy to find: lengths on
the domain sphere are scaled by a factor of $1/R$ by the Gauss map, so areas are scaled
by $1/R^2$ and the Gaussian
curvature is $1/R^2$. Lastly, saddle-shaped regions have negative curvature, as can be
seen from the figure of the catenoid, Fig. 23.1.

Gauss showed that the value of the (Gaussian) curvature at a point was always
the product of the extremal (principal) curvatures at that point, that is, of the
reciprocals of the extremal radii of curvature. So, writing $K$ for the
Gaussian curvature and $k_1$, $k_2$ for the principal curvatures at a point, one has $K = k_1 k_2$.
In particular, if the surface is a minimal surface, its mean curvature vanishes and the
principal curvatures at a point are $\pm k$ (they will vary with $P$), so one has $K = -k^2$.
So a minimal surface has negative Gaussian curvature wherever $k \neq 0$.

Gauss’s reformulation of differential geometry spread slowly across the mathematical
community, and because its deepest discovery concerned the existence of
intrinsic properties of surfaces, applications of it came only slowly to the study of
minimal surfaces, which belong in extrinsic geometry.

To study geodesics on such a surface, Gauss noted that one can impose an analogue
of polar coordinates on a surface, by choosing an arbitrary point $O$ as origin, an
arbitrary geodesic as the base line, and assigning the coordinates $(r, \varphi)$ to the point
that is a distance $r$ along the geodesic that meets the base line at the angle $\varphi$. He then
proved that the curve defined by the points a given distance $r_0$ from $O$ is everywhere
at right angles to the geodesics through $O$. In a system of polar coordinates, $E = 1$,
$F = 0$, and $G$ must be positive so that the metric is positive definite.

He then investigated how to change a given coordinate system to a system of polar
coordinates, and obtained the equations

from which $r$ can be determined, and

from which, when $r$ is known, $\varphi$ can be found. He observed that these equations can
certainly be solved, and indicated that power series solutions would be particularly
interesting. In this way, geodesics are found that are the orthogonal trajectories to
the curves $r = \text{const.}$


25.2.1 Liouville’s Contributions 


Gauss’s ideas about differential geometry were brought to France by Liouville and 
Bonnet. Liouville was a particularly versatile mathematician and an editor of a 
journal he had founded, which put him in a good position to know what was going
on generally, and one of his contributions was to find an important way in which
differential geometry connected to the study of partial differential equations. 

In the 1840s, one of Liouville’s concerns was to acquaint his fellow French mathematicians
with the work of Gauss on differential geometry.² One of the ways he did
this was by re-issuing Monge’s Application de l’analyse à la géométrie, to which he
added a series of notes, remarking that the notes


deal with those points ...for which Mr. Gauss has opened new ways; besides our aim is to 
indicate to young people the sources where they can find information rather than giving them 
regular lectures. 


One result rather sketchily proved by Gauss (although one can legitimately wonder
about Liouville’s argument, as we shall see) was that a surface given in the form
$(x(u, v), y(u, v), z(u, v))$ and with a metric of the form $ds^2 = E\,du^2 + 2F\,du\,dv +
G\,dv^2$ can be given a new coordinate system $\alpha$ and $\beta$ for which the metric takes the
form $ds^2 = \lambda(\alpha, \beta)(d\alpha^2 + d\beta^2)$.

Such a system of coordinates is called isothermal, and its advantage is not only that
it is easier to calculate with but that the coordinate curves $\alpha = \text{const.}$ and $\beta = \text{const.}$
meet everywhere at right angles. This condition on two families of curves occurs
naturally in differential geometry, for example, the principal curves on a surface
meet at right angles.³


²This account follows [192], the definitive biography of Liouville, see pp. 739-747.
³Away, that is, from what are called umbilic points.

To prove this claim, Liouville formally factorised the metric as a product

Liouville now supposed that there was an integrating factor $\mu + i\nu$ that makes the
first factor an exact differential, which he wrote as $d(\alpha + i\beta)$. Multiplying the second
factor by the integrating factor $\mu - i\nu$ makes it exact and equal to $d(\alpha - i\beta)$, and
multiplying the two factors yields

\[ (\mu^2 + \nu^2)\,ds^2 = d\alpha^2 + d\beta^2, \]

which Liouville wrote as

\[ ds^2 = \lambda(\alpha, \beta)(d\alpha^2 + d\beta^2), \qquad \lambda(\alpha, \beta) = \frac{1}{\mu^2 + \nu^2}. \]


The weak point in this argument is the existence of an integrating factor, which 
had been proved in the real case by an argument that does not extend to the complex 
case that Liouville was dealing with without more thought than he gave it. In fact, 
that point had been dealt with by Cauchy in 1819, but his result was not well known 
and it can seem that even Cauchy had forgotten it; in any case, Liouville did not 
mention it. 

Gauss’s great discovery had been the intrinsic nature of the curvature $K$ of a
surface. Liouville offered what he believed was a simpler proof of this result, by
showing that $K$ satisfies a partial differential equation in terms of $\lambda$, which is an
intrinsic quantity:

\[ K = -\frac{1}{2\lambda}\left(\frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2}\right). \]

It would take us too far afield to give his proof, but we can observe the uses of this
result. For example, Liouville showed that a surface of zero curvature can be mapped
isometrically onto a plane by solving the equation

\[ \frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2} = 0. \]

As for surfaces of non-zero curvature, those of constant curvature stand out as
the first to analyse. If the curvature of such a surface is, say, $1/a^2$ then $\lambda$ satisfies the
partial differential equation

\[ \frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2} + \frac{2\lambda}{a^2} = 0, \]

an equation that has become known as Liouville’s equation. It is more often written
today in the form

\[ \frac{\partial^2 z}{\partial\alpha^2} + \frac{\partial^2 z}{\partial\beta^2} = -\frac{2e^z}{a^2}, \]

where $e^z = \lambda$.

Liouville tackled this equation by introducing complex coordinates

\[ \alpha + i\beta = u, \qquad \alpha - i\beta = v, \]

which enabled him to write the equation as

\[ \frac{\partial^2 \log\lambda}{\partial u\,\partial v} + \frac{\lambda}{2a^2} = 0. \]


In Note 4 of the re-edited book by Monge, Liouville merely stated the solution
he had found, offering as the complete integral depending on two arbitrary functions
$\varphi(u)$ and $\psi(v)$ the expression

Further work by Liouville enabled him to obtain as a surface of constant negative
curvature the surface obtained by rotating a tractrix about its axis, the surface later
known as the pseudosphere. 
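
The relation between the conformal factor and the curvature is easy to test numerically. The following is a minimal sketch, assuming the standard formula $K = -\frac{1}{2\lambda}\left(\frac{\partial^2 \log\lambda}{\partial\alpha^2} + \frac{\partial^2 \log\lambda}{\partial\beta^2}\right)$ for a metric $ds^2 = \lambda(d\alpha^2 + d\beta^2)$. The two conformal factors used (the stereographic sphere and the hyperbolic disc) are our own choice of test cases, not Liouville's, and the derivatives are approximated by finite differences.

```python
import math


def log_lambda_sphere(a, b):
    """log of the conformal factor 4 / (1 + a^2 + b^2)^2 (unit sphere, stereographic)."""
    return math.log(4.0 / (1.0 + a * a + b * b) ** 2)


def log_lambda_disc(a, b):
    """log of the conformal factor 4 / (1 - a^2 - b^2)^2 (hyperbolic disc)."""
    return math.log(4.0 / (1.0 - a * a - b * b) ** 2)


def curvature(log_lam, a, b, h=1e-3):
    """K = -(Laplacian of log(lambda)) / (2 lambda), via a five-point stencil."""
    laplacian = (
        log_lam(a + h, b) + log_lam(a - h, b)
        + log_lam(a, b + h) + log_lam(a, b - h)
        - 4.0 * log_lam(a, b)
    ) / (h * h)
    return -laplacian / (2.0 * math.exp(log_lam(a, b)))


k_sphere = curvature(log_lambda_sphere, 0.3, -0.2)  # close to +1
k_disc = curvature(log_lambda_disc, 0.3, -0.2)      # close to -1
```

Both factors therefore satisfy Liouville's equation with constant curvature, $+1$ and $-1$ respectively, at every point where they are defined; the sample point $(0.3, -0.2)$ is arbitrary.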


25.3 Geometrising Mechanics 


In a paper [187], Liouville observed that the kinetic energy of a conservative system
satisfies

\[ m v^2 = 2(U + K), \]

where $K$ is a constant. From the definition $v = \frac{ds}{dt}$ he obtained

and so the action integrand can be written as the square root of

where $h = T - U$, the sum of the kinetic energy $T$ and the (negative of what we
would call the) potential energy $U$, is a constant (the total energy) and

\[ \sum_{i=1}^{m} m_i\,ds_i^2 = \sum_{j,k=1}^{n} q_{jk}\,dq_j\,dq_k. \]

Liouville now observed that because the quadratic differential form is positive
definite it can be written as a sum of squares in a new system of coordinates, so

where

We can think of the array $(P_{jk})$ as a matrix, and the array $(P^{jk})$, which we shall
shortly meet, as its inverse. Those ideas had not yet fully come into mathematics,
and Liouville had to write everything out in full.

He then wrote


where $\theta = \theta(q_1, \dots, q_n)$ is a function yet to be determined. It follows that

Liouville now demanded that $\theta$ satisfies the first-order partial differential equation

\[ \sum_{j=1}^{n} \left( \sum_{k=1}^{n} P^{jk}\,\frac{\partial\theta}{\partial q_k} \right)^{2} = 2(U + K). \]

A little more work (here omitted) allowed Liouville to write the action integral as 
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It is clear that the integral takes a minimum value when )~* jon njle — nel Wy = 0, 
which occurs when

and along this trajectory $A = \theta(1) - \theta(0)$.

Thus, as Lützen ([193], 33) points out, Liouville rederived the Hamilton-Jacobi
formalism in an elegant way. But it seems that he did not draw any analogy with the
differential geometry that we are tempted to read into his work, and most likely did
not appreciate it.

The posthumous publication of Riemann’s Habilitation lecture on geometry in
1867 slowly provoked mathematicians to investigate differential geometry in $n$
dimensions. Importantly, Beltrami’s two papers [5, 6] showed how to rigorously
define intrinsic non-Euclidean geometry in any number of dimensions. In his [188],
Rudolf Lipschitz showed that Hamiltonian mechanics was possible in such a setting,
thus keeping alive the idea that these new geometries are candidates for physical
space.⁴ Lipschitz had already written on mechanics with an eye to geometry in his
[188] and in a later French summary of his work ([189], 297-298) he wrote:⁵


One of the principal aims of the research that we shall analyze here was the profound study 
of Gauss’s measure of curvature. In addition to the approaches which have hitherto led to 
this goal one may choose an approach which consists in presenting, in a general way, all the
fundamental concepts related to the curvature of a surface and then deduce from them the
concept of the measure of curvature. With this in mind, the idea is to find a definition of the
radius of curvature which will lend itself to a natural extension. Here the principles admitted
in ordinary mechanics lead to the following theorem: When a material point which is not
influenced by any accelerating forces is bound to move on a given surface, the pressure 
exerted in each point of the trajectory is inversely proportional to the radius of curvature of 
this trajectory. Accordingly, one may define the reciprocal of the radius of curvature as a 
quantity which is directly proportional to the resulting pressure of this motion. 


The measure of curvature at issue here is what is called geodesic curvature. Recall 
that a parameterised curve in space has at each point a tangent vector, a principal 
normal that measures the rate of change of the tangent vector, and a binormal (with 
which we shall not be concerned). Just as the tangent vector captures the velocity of a 
point moving on the curve, the normal vector captures its acceleration. Now suppose 
that the curve lies on a surface. At each point, the magnitude of the component of 
the normal to the curve that lies in the tangent plane to the surface is the geodesic 
curvature. It has this name because when that component vanishes the normal to 
the curve and the normal to the surface coincide and there is no acceleration in the 
tangent plane. Therefore, in terms of the intrinsic geometry of the surface, the curve 
is as straight as it can be—the definition of a geodesic. 


⁴Earlier Schering, Riemann’s successor in Göttingen, had written his [240] on potential theory in
non-Euclidean geometry.


⁵Quoted in Lützen ([193], 36).


25.4 The Connection to Hamilton-Jacobi Theory


The geometrical interpretation of the solution of the Hamilton-Jacobi equation is 
interesting and illuminating. It turns out that one can suppose that the S-surface is 
moving in Q space along curves that can be regarded as geodesics in Q space with 
a metric determined by the variational principle of the dynamical problem. 


We shall suppose that the points $(q_1, \dots, q_n)$ occupy a simply connected domain
$Q$. We now write the interval $[t_0, t_1] = I$, and consider the space $Q \times I$. We can
regard this space as being made up of hypersurfaces $(q_1, \dots, q_n, t)$ for each fixed
value of $t \in I$. As we have seen, problems in dynamics often throw up a different
family of hypersurfaces, given by the equations

\[ S(q_1, \dots, q_n, t) = \sigma, \]

where $\sigma$ is a constant. The idea here is that through each point $(q_1, \dots, q_n, t_0)$ there
passes a curve that leads in $Q \times I$ to the hypersurface $(q_1, \dots, q_n, t_1)$; we will
see how to specify the right curve shortly: it will be an extremal corresponding to
a variational principle. We assume that these curves never cross, and so we can
suppose that the hypersurface $(q_1, \dots, q_n, t_0)$ flows along them in the direction of
the hypersurface $(q_1, \dots, q_n, t_1)$. However, we do not assume that the flow is at the
same speed on each of the designated curves; as the example of geometrical optics
suggests, commonly one imagines light is travelling along these curves in a way that
depends on the medium through which it is passing.

Or, which is not very different, one supposes that at each point $P_0$ of the hypersurface
$(q_1, \dots, q_n, t_0)$ there is a metric on $Q \times I$ and a geodesic in $Q \times I$ that joins
$P_0$ to a unique point $P_1$ on the hypersurface $(q_1, \dots, q_n, t_1)$. Now one considers the
surfaces that are defined as the points a fixed distance $\sigma$ from the initial hypersurface
$(q_1, \dots, q_n, t_0)$.

In his [188], Lipschitz generalised Hamilton’s principle and deduced conservation
of energy in the new setting. Then he noted that in the special case where $n = 2$ and
$U$ is a constant his formulation of the Hamilton-Jacobi equation is Gauss’s equation
(25.1). This suggested to Lipschitz that in any number of variables the Hamilton-Jacobi
equation can be considered as a transformation of a metric, and the best case is
when the equations of motion (the Euler-Lagrange equations) in the new coordinates
$y_1, y_2, \dots, y_n$ are solved by equations of the form $y_i = \text{const.}$

He was led to establish this theorem (here and below f (dq) is shorthand for the 
quadratic form in the transformed system): 


Let $P(q_1, \dots, q_n, \alpha_1, \dots, \alpha_n)$ be a complete solution of the Hamilton-Jacobi equation. Fix
the values of $\alpha_1, \dots, \alpha_n$ and consider the family of trajectories of the mechanical system that
are orthogonal to the $(n - 1)$ dimensional manifold $P = A$ with respect to the form $f(dq)$.
Then the trajectories are determined by the equations

where $\left(\frac{\partial P}{\partial \alpha_i}\right)_0$ is the value of $\frac{\partial P}{\partial \alpha_i}$ at the intersection point. Moreover, any other $(n - 1)$
dimensional manifold $P = B$ will cut the trajectories orthogonally with respect to the form
$f(dq)$, and the action integral along the trajectories between the two $(n - 1)$ dimensional
manifolds $P = A$ and $P = B$ will all have the value $B - A$.


He also established the converse result⁶:


Let $P(q_1, \dots, q_n) = A$ denote an $(n - 1)$-manifold. Consider the family of trajectories of
the mechanical system cutting this manifold orthogonally with respect to $f(dq)$. On each
trajectory and on the same side of the $(n - 1)$-manifold determine a point such that the
action integral $V$ between the $(n - 1)$-manifold and this point is equal to $B - A$. Then these
points make up an $(n - 1)$-manifold which is orthogonal to all the trajectories with respect
to $f(dq)$. Moreover if the action integral $V$ along a trajectory from its intersection with
$P = A$ to an arbitrary point is considered a function of this latter point, then $R = A + V$ is
a solution of the Hamilton-Jacobi equation and the $(n - 1)$-manifold $R = A$ will coincide
with the original $(n - 1)$-manifold $P = A$.


Thus, Lipschitz’s work made clear the close analogy between the geometric study 
of geodesics in a manifold and the dynamical study of trajectories in Hamiltonian 
mechanics. To be sure, he spoke of a quadratic form, not a metric. But he knew 
Riemann’s work very well, and Beltrami’s, and he referred to Gauss’s work on 
geodesics. 

It seems that he was unaware of the earlier work of Liouville, and in turn his work 
was unknown to Thomson and Tait, who had stated Lipschitz’s result (for a single 
particle subject to a force) in vol. 1, p. 353 of their book. But other authors did read 
Lipschitz’s paper, among them the French geometer Gaston Darboux. 

Darboux makes an interesting contrast with Felix Klein. Both men saw geometry 
as the natural way to formulate and solve problems, both were energetic writers 
of books as well as research articles. Darboux inclined, as a well-educated French 
mathematician, to the study of differential equations and differential geometry; Klein, 
as a well-educated German mathematician, to projective geometry, with an original 
interest in groups. 

In Volume 2 of his Leçons sur la Théorie Générale des Surfaces [58], Darboux
developed the study of curves on surfaces, the ideas of curvature and torsion, and 
geodesics. Then he turned (in Book 5, Chaps. 6-8) to what he saw as the close analogy 
between Gauss’s theory of geodesics and Jacobi’s theory of analytical mechanics. 
In this spirit, he mentioned the work of Thomson and Tait, introduced Hamilton’s 
principle, and took his readers through Lagrangian and Hamiltonian dynamics in the 
manner of Liouville and Lipschitz. With his lucid exposition the theories of classical 
mechanics and differential geometry were united. 

Indeed, as Darboux puts it (§569), the work of Liouville and Lipschitz


establishes the principle of least action without the use of the calculus of variations, and by 
methods that are entirely algebraic. 


Lützen has drawn attention to a particularly attractive point in Darboux’s account
(Chap. VII, §§571-577). Darboux formally eliminated the time variable from the
action integral by noting that conservation of energy implies that


⁶This and the earlier result are quoted in Lützen ([193], 43-44).

where

\[ 2T = \sum_{j,k} a_{jk}\,dq_j\,dq_k. \]


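Therefore, Lagrange’s equations can be written in the form (in Darboux’s disturbing
notation)

\[ d\,\frac{\partial \sqrt{(U+h)T}}{\partial\,dq_j} - \frac{\partial \sqrt{(U+h)T}}{\partial q_j} = 0. \]

But this is what is found by saying that the first variation of the integral

\[ \int_0^1 \sqrt{(U+h)T} = \frac{1}{2} \int_0^1 \sqrt{2(U+h) \sum_{j,k} a_{jk}\,dq_j\,dq_k} \]

vanishes.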
Darboux therefore set

\[ ds^2 = 2(U+h) \sum_{j,k} a_{jk}\,dq_j\,dq_k. \]

He called $ds$ the elementary action, and remarked that the general problem of mechanics
is reduced to the study of the extrema of the integral $\int ds$. But


This is what the principle of least action consists of; and one sees immediately, thanks to 
this principle, that the general problem of mechanics is only an extension to any number of 
variables of the problem of studying geodesic curves. 


(All that is missing is the fully Riemannian idea that any number of variables together 
with a metric define a geometric space.) 
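
Darboux's remark can be made concrete in a small numerical experiment. The sketch below is an illustration of ours, under stated assumptions: a single particle of unit mass in the plane, so that $\sum a_{jk}\,dq_j\,dq_k = dx^2 + dy^2$, moving under uniform gravity, so that $U = -gy$ in the sign convention used here, with $g = 1$ and launch velocity $(1, 1)$, giving $h = 1$. The true trajectory from $(0, 0)$ to $(1, \tfrac12)$ is the parabola $y = x - x^2/2$, and the code compares its length in the metric $ds^2 = 2(U+h)(dx^2 + dy^2)$ with that of nearby curves having the same end points. The perturbation $\varepsilon \sin \pi x$ and all names are hypothetical choices.

```python
import math

G = 1.0  # assumed uniform gravity, so U = -G * y in the text's sign convention
H = 1.0  # total energy h for unit mass and launch velocity (1, 1)


def height(x, eps):
    """True parabola y = x - G x^2 / 2 plus a perturbation vanishing at both ends."""
    return x - 0.5 * G * x * x + eps * math.sin(math.pi * x)


def jacobi_length(eps, n=4000):
    """Approximate the integral of ds = sqrt(2 (U + h) (dx^2 + dy^2)) for 0 <= x <= 1."""
    total = 0.0
    for i in range(n):
        x0, x1 = i / n, (i + 1) / n
        y0, y1 = height(x0, eps), height(x1, eps)
        y_mid = 0.5 * (y0 + y1)
        # sqrt(2 (U + h)) evaluated at the segment, times its Euclidean length
        total += math.sqrt(2.0 * (H - G * y_mid)) * math.hypot(x1 - x0, y1 - y0)
    return total


length_true = jacobi_length(0.0)     # should be close to 4/3
length_above = jacobi_length(0.05)   # perturbed curve, same end points
length_below = jacobi_length(-0.05)
```

Along the true trajectory $ds = 2T\,dt$, so the length is $\int_0^1 \bigl(1 + (1-t)^2\bigr)\,dt = 4/3$, and the perturbed curves come out longer: on this short arc the mechanical trajectory behaves as a shortest path for Darboux's elementary action.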

The final detail was contributed by Heinrich Hertz. His expertise was in physics, 
and he fulfilled his initial promise by being the first person to confirm Maxwell’s 
prediction that electro-magnetic waves travel at the speed of light and that light is 
itself an electro-magnetic wave. The book in which he set out his most fundamental 
ideas about mechanics, his Die Prinzipien der Mechanik in neuem Zusammenhange
dargestellt, was published posthumously in the year of his death, 1894.

Hertz had a dislike of the concept of force—there was in fact a long tradition of 
mathematicians and physicists for whom the word covered up a lack of understanding 
and needed to be replaced. He also disliked the concept of energy, and his own way 
of doing without these concepts was to re-interpret the Lagrangian equations of 
dynamics in terms of some new, “hidden” masses. This seems to have convinced 
no one, but on the way he came up with a thorough-going geometrical reading 
of dynamics that is close to that of Lipschitz and Darboux, but which went one 
step further by calling the quadratic form a metric. Surprisingly, it seems from the
definitive study of Hertz’s mechanics that Hertz came to these ideas initially unaware
of this earlier work ([194], 160), however much his later acquaintance with it
may have affected his exposition.

It was now possible for mathematicians to say explicitly what Darboux had said at 
length but without the concept of a general space up front: Trajectories in Hamiltonian 
dynamics evolve in phase space along trajectories that are geodesics with respect to 
the metric that is the quadratic form in the energy functional. 


25.5 Exercises 


[In defiance of my earlier proscription! ] 


1. Given a straight line (a geodesic) in the plane and a family of geodesics at right 
angles to it and of equal length, what is the curve formed by the end points of 
these geodesics? 

2. The same question, but for geodesics on the sphere. 

3. The same question, but for geodesics in the disc with a metric of constant negative 
curvature (the non-Euclidean or hyperbolic disc). 


Questions 


1. The extrema of the integral of a Lagrangian with fixed times for the lower and 
upper end points determine the curves along which a dynamical system evolves. 
By studying what happens when the upper end point or time is allowed to vary, 
Hamilton studied how those trajectories evolve. The time-dependent Lagrangian 
satisfies a partial differential equation with respect to which these trajectories are 
the characteristic curves. These curves can be seen as geodesics with respect to
a metric determined by the Lagrangian. If you know about general relativity, this is
the bridge between Einstein’s approach and Hilbert’s; see what you can find out 
about their competitive rivalry in 1915, starting with Corry [48]. 


Chapter 26
The Calculus of Variations in the
Nineteenth Century


26.1 Introduction 


Lagrange’s theory of the calculus of variations was successful, influential, and, just 
like the early calculus itself, hard to explain. As it was steadily improved, first by 
Legendre and then by Jacobi, it also became clear that it reflected an eighteenth 
century naivety about the nature of functions and had not properly considered the 
range of possibilities for candidate curves. In particular, functions were generally 
taken to be infinitely differentiable. The best nineteenth century theory for this was 
presented by Adolf Kneser towards the end of the century, building on earlier ideas 
of Weierstrass.¹ The subject also forms the last of the famous Hilbert Problems.


26.2 After Lagrange 


Lagrange’s elegant enrichment of Euler’s insights created a workable calculus of 
variations, but it rested on a mysterious theory, and as Lagrange’s personal confidence 
in an algebraic foundation for the calculus as a whole found few who shared it, it 
was natural for others to investigate and extend the theory more carefully. 

It was clear, for example, that the solutions found were generally either maxima or 
minima, and that this issue could usually be decided from the context, but it would be 
better to have a way of deciding it on theoretical grounds. Legendre initiated such an 
investigation in a paper of 1788, when he studied what is called the second variation
of the integral. (The name derives from the analogy with the calculus of functions
$y = y(x)$: if at a given point $\frac{dy}{dx} = 0$ then the sign of $\frac{d^2 y}{dx^2}$ can determine if the point
is a local maximum or minimum of $y(x)$.)


¹An excellent historical account, going into more detail than is possible here, is Fraser [107].

Legendre took the integral

\[ I = \int_a^b f(x, y, y')\,dx \]

and assumed that the function $y$ is an extremal for the integral $I$. He then considered
a function $w(x)$ which varied $y(x)$, writing $\delta y = w(x)$, where $w(a) = 0 = w(b)$.


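The first variation of $I$ is

\[ I_1 = \int_a^b \left( \frac{\partial f}{\partial y}\,w + \frac{\partial f}{\partial y'}\,w' \right) dx, \]

and the second variation of $I$ is

\[ I_2 = \int_a^b \left( \frac{\partial^2 f}{\partial y^2}\,w^2 + 2\,\frac{\partial^2 f}{\partial y\,\partial y'}\,w w' + \frac{\partial^2 f}{\partial y'^2}\,w'^2 \right) dx. \]

The total variation in $I$ is given by

\[ \Delta I = I_1 + \frac{1}{2} I_2 + \cdots. \]

On the reasonable assumption that $I_1$ dominates this expression, for an extremal
$y$ it is clear that $I_1$ must vanish. This ensures the validity of the Euler-Lagrange
equations for $y$. To determine whether the extremal is a maximum or a minimum,
Legendre looked at the sign of $I_2$. After some work, here omitted, he concluded that
the extremal will be a minimum if

\[ \frac{\partial^2 f}{\partial y'^2} > 0. \]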

This criterion is easy to use, but its derivation was troubling. Legendre produced a
transformation of the integral $I_2$ that converted it into the integral

\[ \int_a^b \left( \frac{\partial^2 f}{\partial y'^2} \right) \bigl( h(x, y, y') \bigr)^2\,dx, \]

from which the conclusion follows immediately. But the function $h$ is found by
solving the differential equation

\[ \left( \frac{\partial^2 f}{\partial y'^2} \right) \left( \frac{\partial^2 f}{\partial y^2} + v' \right) = \left( \frac{\partial^2 f}{\partial y\,\partial y'} + v \right)^2 \]

for the unknown function $v = v(x)$. However, Legendre had produced no general
method for solving this non-linear equation, and Lagrange was able to show that 
solutions may not exist on the whole interval [a, b]. 
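
The transformation Legendre needed can be sketched in modern notation. The sketch assumes that $h$ is the combination of $w$ and $w'$ appearing in the last line below, which is one reading of the argument and not Legendre’s own wording; $I_2$ denotes the second variation, and subscripts denote partial derivatives of $f$. Because $w(a) = w(b) = 0$, the integral of $\frac{d}{dx}(v w^2)$ over $[a, b]$ vanishes for any differentiable $v$, so it can be added to the second variation without changing it:

```latex
\begin{align*}
I_2 &= \int_a^b \left( f_{yy} w^2 + 2 f_{yy'} w w' + f_{y'y'} w'^2 \right) dx
       + \int_a^b \frac{d}{dx}\left( v w^2 \right) dx \\
    &= \int_a^b \left( (f_{yy} + v') w^2 + 2 (f_{yy'} + v) w w' + f_{y'y'} w'^2 \right) dx \\
    &= \int_a^b f_{y'y'} \left( w' + \frac{f_{yy'} + v}{f_{y'y'}}\, w \right)^2 dx
       \quad \text{provided } f_{y'y'} (f_{yy} + v') = (f_{yy'} + v)^2 .
\end{align*}
```

When such a $v$ exists on the whole of $[a, b]$, the sign of the second variation is that of $f_{y'y'}$, which is the criterion stated above.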

The next advance was made by Jacobi, in his major paper [150]—almost 50 
years after Legendre’s. In it, he showed how to use a solution to the Euler-Lagrange 
equations for y to obtain a solution to Legendre’s differential equation. In particular, 
he was able to show that the existence of a solution to the Euler-Lagrange equations 
stands or falls with the existence of solutions to a transformed version of Legendre’s 
differential equation that never vanish in the interval (a, b). But, for whatever reason, 
Jacobi gave no proofs of his results in this paper, and it, therefore, generated a 
considerable amount of research by German mathematicians of the next generation, 
chiefly Otto Hesse, Rudolf Clebsch, and Adolph Mayer, before its conclusions were 
considered fully established in the mid-1850s. 


26.3 Weierstrass’s Theory 


Such were the difficulties inherent in the calculus of variations, and perhaps also 
such was the naiveté about the varied nature of functions throughout much of the 
nineteenth century, that it is only with Weierstrass’s lectures at the University of 
Berlin in 1879 and thereafter that some fundamental issues were confronted for the 
first time (Fig. 26.1). 

One concerns the very idea of a variation, or more precisely, a small variation. 
If one attempts with goodwill to copy the graph of a function, say for definiteness 
$y = \sin x$ in the interval $[0, 2\pi]$, one surely draws a smooth curve that looks very
like the original even though it may agree with it at very few points. However, it
is easy to convince oneself intellectually that there is, for example, a very wriggly
curve that crosses the sine curve very many times and is always closer to the sine
curve than the first smooth approximation that was drawn. Which of these curves
should be considered as the smaller variation of the sine curve?

Fig. 26.1 Karl Weierstrass (1815-1897) by Conrad Fehr, 1895

As put, Lagrange’s theory gives no reason to hesitate: the comparison is made 
only between curves of the first kind, whose values and slopes differ little. But as 
sophistication about curves grew, mathematicians realised that it might be advisable 
to decide whether rapidly oscillating curves were close to slowly oscillating ones, 
or if the marked variation in corresponding values of y’(x) should be taken into 
account. In particular, the English mathematician Isaac Todhunter was worried by 
the possibility of a few abrupt changes in values of y’(x), but he did not produce a 
systematic theory to cope with the difficulties that he had identified. 

Weierstrass, who was alert to the great difference between differentiable and
merely continuous functions, went much further. Given an integral $I$ of the form
$I = \int f(x, y, y')\,dx$ and a function $\bar{y}(x)$ that is a solution of the corresponding
Euler-Lagrange equation, Weierstrass considered the curve described by $\bar{y}$ and a
comparison curve $y(x)$ that agreed with $\bar{y}$ at $x = a$ and again at a point $P$ where $x = c$,
$a < c < b$, where they crossed.² So their slopes will differ at $x = a$. He considered
the excess function

\[ E(x, \bar{y}, \bar{y}', y') = f(x, \bar{y}, y') - f(x, \bar{y}, \bar{y}') - (y' - \bar{y}')\,\frac{\partial f}{\partial y'}(x, \bar{y}, \bar{y}'). \]

This allowed him to consider comparison curves with different slopes, and he was
able to show that it is necessary and (he believed) also sufficient for the curve $\bar{y}$ to
be a minimum that

\[ E(x, \bar{y}, \bar{y}', y') \ge 0. \]


Weierstrass’s analysis began to clarify the nature of the comparison curves, 
because it opened the way to curves that oscillate much more than the extremal. 
Later, Adolf Kneser was able to show that if a minimiser of an integral is sought 
only among curves that differ from an extremal only slightly in respect of both their 
values and the values of their derivatives, then Weierstrass’s necessary and sufficient 
conditions for a minimum are correct. But if comparison curves may differ in the 
values of the derivatives, then Weierstrass’s condition is not sufficient. He called the 
first case the “weak” variation and the second case the “strong” variation. 
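
To see the excess function at work, here is a small Python sketch. It assumes the arc-length integrand $f = \sqrt{1 + y'^2}$, chosen here purely for illustration and not taken from Weierstrass’s lectures; the function names are likewise illustrative. Since this $f$ depends only on the slope, the excess function reduces to a function of the slope of the extremal and the slope of the comparison curve, and the script checks on a grid of slopes, some of them very steep, that it is never negative.

```python
import math

def f(p):
    """Arc-length integrand sqrt(1 + y'^2); it depends only on the slope p (assumed example)."""
    return math.sqrt(1.0 + p * p)

def f_p(p):
    """Partial derivative of f with respect to the slope p."""
    return p / math.sqrt(1.0 + p * p)

def excess(p_bar, p):
    """Weierstrass excess E = f(p) - f(p_bar) - (p - p_bar) * f_p(p_bar)."""
    return f(p) - f(p_bar) - (p - p_bar) * f_p(p_bar)

# Sample slopes from -50 to 50 in steps of 0.5, including steep comparison directions.
slopes = [k / 2.0 for k in range(-100, 101)]
smallest = min(excess(p_bar, p) for p_bar in slopes for p in slopes)

# True if E >= 0 on the whole grid, up to rounding error.
print(smallest >= -1e-9)
```

The check succeeds because this integrand is convex in the slope: the excess function measures how far the graph of $f$ lies above its tangent line at the slope of the extremal.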

In his lectures at Berlin, Weierstrass developed his theory in the formalism of 
parameterised curves. This had the effect of making it harder to use in any but 
geometrical problems, but that has the incidental effect of making it easier to describe 
in general terms. 

At issue is the search for a curve in the plane that joins the points $(a_0, b_0)$ and
$(a_1, b_1)$ and minimises a given integral $J$. The curves through these points will be
considered in the form $(x(t), y(t))$, $0 \le t \le 1$. Given one such curve, a nearby curve
will be written as

\[ (x(t) + \xi(t),\; y(t) + \eta(t)), \quad 0 \le t \le 1, \]

where $\xi(0) = \xi(1) = 0 = \eta(0) = \eta(1)$. (Precisely what conditions can be imposed
on the curves and their variations soon became a topic of research—here we shall
assume continuously twice differentiable.)

2The bar notation for the extremal is due to Kneser; it was not used by Weierstrass.

The variation in the integral $J$ can be written down and analysed, and the conclusion
is that $\delta J$ must vanish for all admissible functions $\xi$ and $\eta$. This implies
Weierstrass’s form of the Euler-Lagrange equations:

\[ F_x - \frac{d}{dt} F_{x'} = 0, \qquad F_y - \frac{d}{dt} F_{y'} = 0. \]

These equations are not independent, however, and are equivalent to the differential
equation

\[ F_{xy'} - F_{yx'} + F_1 (x'y'' - x''y') = 0, \]

where $F_1 = F_1(x, y, x', y')$ satisfies

\[ F_{x'x'} = y'^2 F_1, \qquad F_{x'y'} = -x'y' F_1, \qquad F_{y'y'} = x'^2 F_1. \]

This function, which Weierstrass showed exists, is “infinite” when $x'$ and $y'$ vanish
simultaneously. Curves $(x(t), y(t))$, $0 \le t \le 1$, satisfying this differential equation
are extremals for the problem. A further equation is needed to determine the functions
$x(t)$ and $y(t)$ precisely, should that be necessary.

The method works well for finding the shortest curve on a surface joining two
given points when the surface is given as a graph over a region of the plane, and so
in the form $z = f(x, y)$. The integral to be minimised is

\[ J = \int_0^1 \left( E u'^2 + 2F u'v' + G v'^2 \right)^{1/2} dt, \]

where $u = \xi(t)$ and $v = \eta(t)$. Here, as usual in differential geometry, with $\mathbf{r}(u, v) =
(u, v, z(u, v))$,

\[ E = \mathbf{r}_u \cdot \mathbf{r}_u, \quad F = \mathbf{r}_u \cdot \mathbf{r}_v, \quad G = \mathbf{r}_v \cdot \mathbf{r}_v. \]


As is also usual in differential geometry, the resulting differential equation looks 
intimidating at first but has a conceptually simple interpretation: the acceleration of 
the minimising curve with respect to parameterisation by arc length (the principal 
normal to the curve) is normal to the surface. This means that in terms of the intrinsic 
geometry of the surface there is no acceleration, and so the curve is as “straight” as 
it can be, which makes it a geodesic as required. 
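
The length integral written in terms of $E$, $F$, $G$ can be tried out numerically. The following Python sketch is only an illustration, not a geodesic solver: the surface (a paraboloid), the curve in the parameter plane, the step sizes, and all function names are assumptions made for the example. It evaluates the integral of $\sqrt{E u'^2 + 2F u'v' + G v'^2}$ for a graph surface and compares the result with the length of a fine polygon drawn through the lifted curve in space.

```python
import math

def z(u, v):
    """Hypothetical example surface: the paraboloid z = u^2 + v^2."""
    return u * u + v * v

def fundamental_form(u, v, h=1e-6):
    """E, F, G for r(u, v) = (u, v, z(u, v)), with z_u and z_v by central differences."""
    z_u = (z(u + h, v) - z(u - h, v)) / (2 * h)
    z_v = (z(u, v + h) - z(u, v - h)) / (2 * h)
    return 1 + z_u * z_u, z_u * z_v, 1 + z_v * z_v

def length_intrinsic(curve, velocity, n=2000):
    """Midpoint-rule value of the integral of sqrt(E u'^2 + 2 F u'v' + G v'^2) over [0, 1]."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        u, v = curve(t)
        du, dv = velocity(t)
        E, F, G = fundamental_form(u, v)
        total += math.sqrt(E * du * du + 2 * F * du * dv + G * dv * dv) / n
    return total

def length_in_space(curve, n=2000):
    """Length of the polygon through n + 1 points of the lifted curve in 3-space."""
    pts = [(*curve(k / n), z(*curve(k / n))) for k in range(n + 1)]
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def segment(t):
    """Curve in the parameter plane: the straight segment from (0, 0) to (1, 0)."""
    return (t, 0.0)

def segment_velocity(t):
    """Derivative (u', v') of the segment above."""
    return (1.0, 0.0)

print(length_intrinsic(segment, segment_velocity), length_in_space(segment))
```

The two numbers agree to several decimal places, which is the sense in which $E$, $F$, $G$ carry all the information about lengths on the surface.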

Weierstrass’s method extended to the second variation, and encompassed both 
Legendre’s and Jacobi’s conclusions, and resulted in necessary and sufficient condi- 
tions for both weak and strong minima. The method shows, for example, that in the 
above example the curve is indeed of shortest length between the given end points.


26.3.1 Two Examples


I take these examples from Bolza ([19], 128-129, 210-211). The first requires the 
above characterisation of geodesics, and the second one is straightforward. 

a) Example XI: To determine the curve of shortest length which can be drawn on 
a given surface between two given points. 

If the rectangular coordinates x, y, z of a point of the surface are given as functions 
of two parameters u, v and the curves on the surface are expressed in parameter- 
representation 


\[ u = \phi(t), \quad v = \psi(t) \tag{27} \]


the problem is to minimise the integral

\[ J = \int_0^1 \sqrt{E u'^2 + 2F u'v' + G v'^2}\,dt, \]

where

\[ E = \sum x_u^2, \quad F = \sum x_u x_v, \quad G = \sum x_v^2, \]

the summation sign referring to a cyclic permutation of $x, y, z$.

The curves must be restricted to such a portion $S$ of the surface that the correspondence
between $S$ and its image $T$ in the $u$-, $v$-plane is a one-to-one correspondence.
We further suppose that $E, F, G$ are of class $C''$ in $T$ and that $S$ is free from singular
points, i.e.

\[ EG - F^2 > 0. \]


a) If we use Weierstrass’s form (I) of Euler’s equation, and denote by $\Phi(F)$ the
differential expression

\[ \Phi(F) = F_{xy'} - F_{yx'} + F_1 (x'y'' - x''y'), \]

we obtain easily

\[ \Phi\left( \sqrt{E u'^2 + 2F u'v' + G v'^2} \right) = \frac{\Gamma}{\left( \sqrt{E u'^2 + 2F u'v' + G v'^2} \right)^3}, \tag{28} \]

where

\begin{align*}
\Gamma = {} & (EG - F^2)(u'v'' - u''v') + (E u' + F v') \left[ \left( F_u - \tfrac{1}{2} E_v \right) u'^2 + G_u u'v' + \tfrac{1}{2} G_v v'^2 \right] \\
& - (F u' + G v') \left[ \tfrac{1}{2} E_u u'^2 + E_v u'v' + \left( F_v - \tfrac{1}{2} G_u \right) v'^2 \right]. \tag{29}
\end{align*}

The extremals satisfy, therefore, the differential equation

\[ \Gamma = 0. \tag{29a} \]

This differential equation admits of a simple geometrical interpretation: The geodesic
curvature of the curve (27) at the point $t$ is given by the expression

\[ \frac{1}{\rho_g} = \frac{\Gamma}{\sqrt{EG - F^2}\,\left( \sqrt{E u'^2 + 2F u'v' + G v'^2} \right)^3}. \]

Hence the curve of shortest length has the characteristic property that its geodesic
curvature is constantly zero, i.e. it is a geodesic.
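For the second example I quote without proof that extremals of an integral

\[ J = \frac{1}{2} \int_{t_0}^{t_1} (xy' - x'y)\,dt \]

subject to the condition that

\[ K = \int_{t_0}^{t_1} \sqrt{x'^2 + y'^2}\,dt \]

is constant are found by defining $H = F + \lambda G$, where $F$ and $G$ denote the integrands
of $J$ and $K$, and solving the equation

\[ H_{xy'} - H_{yx'} + H_1 (x'y'' - x''y') = 0, \]

where

\[ H_1 = \frac{H_{x'x'}}{y'^2} = -\frac{H_{x'y'}}{x'y'} = \frac{H_{y'y'}}{x'^2}. \]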

b) Example XIII: Among all curves of given length joining two given points $A$
and $B$, determine the one which, together with the chord $AB$, bounds the maximum
area. [This is Dido’s problem.]

Taking the straight line joining $A$ and $B$ for the $x$-axis, with $BA$ for positive direction,
we have to maximise the integral

\[ J = \frac{1}{2} \int_{t_0}^{t_1} (xy' - x'y)\,dt, \]

while

\[ K = \int_{t_0}^{t_1} \sqrt{x'^2 + y'^2}\,dt \]

has a given value, say $l$, which we suppose greater than the distance $AB$. Since

\[ H = \frac{1}{2}(xy' - x'y) + \lambda \sqrt{x'^2 + y'^2} \]

we get

\[ H_1 = \frac{\lambda}{\left( \sqrt{x'^2 + y'^2} \right)^3} \]

and therefore [...]

\[ \frac{x'y'' - x''y'}{\left( \sqrt{x'^2 + y'^2} \right)^3} = -\frac{1}{\lambda}. \]

Hence the radius of curvature of the maximising curve is constant and has the value
$|\lambda|$, while its direction is determined by the sign of $\lambda$.

Again, since $H_1$ never vanishes, there can be no corners, and therefore the curve
must be an arc of a circle of radius $|\lambda|$. The centre and the radius of the circle are
determined by the conditions that the arc shall pass through the two given points and
shall have length $l$. There are two arcs satisfying these conditions, symmetrical with
respect to the $x$-axis.

Fig. 26.2 David Hilbert (1862-1943) c. 1886
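
The last step of Example XIII, finding the circle from the chord and the prescribed length, can be sketched numerically. The Python below is an illustration under stated assumptions and is not part of Bolza’s text: the function name `dido_arc`, its parameters, and the use of bisection are all choices made here. If $\theta$ is half the angle the arc subtends at the centre, then the length is $2r\theta$ and the chord is $2r\sin\theta$, so $\theta$ solves $\sin\theta / \theta = \text{chord}/\text{length}$, and the radius follows.

```python
import math

def dido_arc(chord, length, tol=1e-12):
    """Return (radius, half_angle) of the circular arc of given length on a given chord.

    With half-angle theta at the centre, length = 2 r theta and
    chord = 2 r sin(theta), so theta solves sin(theta) / theta = chord / length.
    """
    if not 0 < chord < length:
        raise ValueError("need 0 < chord < length")
    target = chord / length
    lo, hi = 0.0, math.pi          # sin(theta)/theta falls from 1 to 0 on (0, pi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > target:
            lo = mid               # ratio still too large: the arc must bend more
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return length / (2 * theta), theta

# A chord of length 2 with an arc of length pi: the arc is a semicircle.
radius, theta = dido_arc(chord=2.0, length=math.pi)
print(radius, theta)
```

The radius returned plays the role of $|\lambda|$ in the example; the two symmetrical arcs correspond to the two signs of $\lambda$.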


26.4 Hilbert’s Problem 23 and the Theory of the Calculus 
of Variations 


By 1900 David Hilbert (Fig. 26.2) was the agreed new leader of mathematics in Germany.
He was the most powerful figure in the growing collection of highly talented
mathematicians that Felix Klein was bringing together in Göttingen, an authority at
successive stages in his career on algebraic invariant theory, algebraic number theory,
and plane geometry, a surprising choice that allowed him to produce a new branch
of mathematics, the study of axiom systems.³


3As Adolf Hurwitz, his former teacher, remarked.

Also in 1900, the French had managed to stage the largest celebration of the
new century: 6 months of Congresses on various topics. Philosophy, physics, and 
mathematics had week-long meetings in August; Poincaré spoke at all three. But 
the Congress of Mathematicians is famous for Hilbert’s contribution. Hilbert’s close 
friend Hermann Minkowski had suggested to Hilbert that he speak on the future 
of mathematics as a particularly appropriate subject, and that is what Hilbert did. 
He offered a general panorama on the dialogue between problems and theory, giv- 
ing Fermat’s last theorem and the brachistochrone problem as his key examples of 
provocative problems that had produced important new theories. Then he presented 
some 23 problems (10 in his address at the congress, all 23 in his published paper). 
They were on a great variety of topics, and over the years, helped by the prestige of 
Göttingen, they became celebrated, and reputations were made for solving one.

The last five were on analysis, indicative of Hilbert’s shift to a new interest (he 
was occupied with functional analysis for most of the next few years, research that 
provides the origin of “Hilbert space”). Hilbert’s 23rd and final problem in his list in 
Paris connected to the opening words of his lecture, where he spoke of the importance 
of the brachistochrone problem and went on to remark that⁴


...for example, the problem of the shortest line plays a chief and historically important part 
in the foundations of geometry, in the theory of lines and surfaces, in mechanics and in the 
calculus of variations. 


A little later in his address he returned to the topic and made the striking remark 
that 


It is an error to believe that rigor in the proof is the enemy of simplicity. On the contrary, 
we find it confirmed by numerous examples that the rigorous method is at the same time the 
simpler and the more easily comprehended. The very effort for rigor forces us to discover 
simpler methods of proof. It also frequently leads the way to methods which are more capable 
of development than the old methods of less rigor. 


He offered some examples and then continued 


But the most striking example of my statement is the calculus of variations. The treatment 
of the first and second variations of definite integrals required in part extremely complicated 
calculations, and the processes applied by the old mathematicians lacked the necessary rigor. 
Weierstrass showed us the way to a new and sure foundation of the calculus of variations. 
By the examples of the simple and double integral I will show briefly, at the close of my 
lecture, how this way leads at once to a surprising simplification of the calculus of variations. 
For in the demonstration of the necessary and sufficient criteria for the occurrence of a 
maximum and minimum, the calculation of the second variation and in part, indeed, the 
tiresome reasoning connected with the first variation may be completely dispensed with to 
say nothing of the advance which is involved in the removal of the restriction to variations 
for which the differential coefficients of the function vary only slightly. 


Then, at the end of his lecture Hilbert outlined the problems posed by the calculus
of variations in these terms. What he says is not easy to follow, and it will be enough
to go straight to his conclusion; note only that he presented a rigorous and ingenious
argument to his conclusion. In fact, although his reputation has become that of a
highly abstract thinker, his contemporaries regarded him as a problem-solver first
and foremost, and without getting into the details of his new presentation of the
calculus of variations you can see that his definition of $J^*$ exhibits the cunning of a
problem-solver more than the sweep of a theorist.

4For accounts of Hilbert’s Paris problems, their origins and their influence down the twentieth
century, see Gray [125] and Yandell [275].

His approach soon became an accepted part of the theory. The connection to
Hamilton-Jacobi theory was appreciated, but so was the simplicity of the derivation.
It can be found, for example, in Osgood’s paper [208], Bolza’s book ([19], 92) and in
the 15-page article by Ernst Zermelo and Hans Hahn on recent further developments
in the calculus of variations that they published in the Encyklopädie der Mathematischen
Wissenschaften in 1904. Hilbert himself gave a more detailed account in his
[144].


23. FURTHER DEVELOPMENT OF THE METHODS OF THE CALCULUS OF VARIATIONS.


So far, I have generally mentioned problems as definite and special as possible, in the opinion
that it is just such definite and special problems that attract us the most and from which the 
most lasting influence is often exerted upon science. Nevertheless, I should like to close 
with a general problem, namely with the indication of a branch of mathematics repeatedly 
mentioned in this lecture — which, in spite of the considerable advance Weierstrass has 
recently given it, does not receive the general appreciation which, in my opinion, is its due 
—I mean the calculus of variations. 


The lack of interest in this is perhaps due in part to the need of reliable modern text books. 
So much the more praiseworthy is it then that A. Kneser, in a work published very recently, 
has treated the calculus of variations from the modern points of view and with regard to the 
modern demand for rigor. 

The calculus of variations is, in the widest sense, the theory of the variation of functions, 
and as such appears as a necessary extension of the differential and integral calculus. In this 
sense, Poincaré’s investigations of the three body problem, for example, form a chapter in 
the calculus of variations, in so far as Poincaré derived from known orbits by the principle 
of variation new orbits of similar character. 

I add here a short justification of the general remarks upon the calculus of variations made 
at the beginning of my lecture. 

The simplest problem in the calculus of variations proper is known to consist in finding a 
function $y$ of a variable $x$ such that the definite integral

\[ J = \int_a^b F(y_x, y; x)\,dx, \qquad y_x = \frac{dy}{dx}, \]


assumes a minimum value compared with the values it takes when y is replaced by other 
functions of x with the same initial and final values. 


The vanishing of the first variation in the usual sense

\[ \delta J = 0 \]

gives for the desired function $y$ the well-known differential equation

\[ \frac{dF_{y_x}}{dx} - F_y = 0, \tag{26.2} \]

\[ \left[ F_{y_x} = \frac{\partial F}{\partial y_x}, \quad F_y = \frac{\partial F}{\partial y} \right]. \]

In order to investigate more closely the necessary and sufficient criteria for the occurrence
of the required minimum, we consider the integral

\[ J^* = \int_a^b \{ F + (y_x - p) F_p \}\,dx, \]

\[ \left[ F = F(p, y; x), \quad F_p = \frac{\partial F(p, y; x)}{\partial p} \right]. \]

Now we inquire how $p$ is to be chosen, as function of $x$, $y$ in order that the value of this
integral $J^*$ shall be independent of the path of integration, i. e., of the choice of the function
$y$ of the variable $x$. The integral $J^*$ has the form

\[ J^* = \int_a^b \{ A y_x - B \}\,dx, \]

where $A$ and $B$ do not contain $y_x$, and the vanishing of the first variation

\[ \delta J^* = 0, \]

in the sense which the new question requires, gives the equation [see below]

\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0, \]

i. e. we obtain for the function $p$ of the two variables $x$, $y$ the partial differential equation of
the first order

\[ \frac{\partial F_p}{\partial x} + \frac{\partial (p F_p - F)}{\partial y} = 0. \tag{26.3} \]

The ordinary differential equation of the second order (26.2) and the partial differential
equation (26.3) stand in the closest relation to each other. This relation becomes immediately
clear to us by the following simple transformation

\begin{align*}
\delta J^* &= \int_a^b \{ F_y\,\delta y + F_p\,\delta p + (\delta y_x - \delta p) F_p + (y_x - p)\,\delta F_p \}\,dx \\
&= \int_a^b \{ F_y\,\delta y + \delta y_x\,F_p + (y_x - p)\,\delta F_p \}\,dx \\
&= \delta J + \int_a^b (y_x - p)\,\delta F_p\,dx.
\end{align*}


We derive from this, namely, the following facts: If we construct any simple family of
integral curves of the ordinary differential equation (26.2) of the second order and then form
an ordinary differential equation of the first order

\[ y_x = p(x, y) \tag{26.4} \]

which also admits these integral curves as solutions, then the function $p(x, y)$ is always
an integral of the partial differential equation (26.3) of the first order; and conversely, if
$p(x, y)$ denotes any solution of the partial differential equation (26.3) of the first order, all
the non-singular integrals of the ordinary differential equation (26.4) of the first order are
at the same time integrals of the differential equation (26.2) of the second order, or in short
if $y_x = p(x, y)$ is an integral equation of the first order of the differential equation (26.2)
of the second order, $p(x, y)$ represents an integral of the partial differential equation (26.3)
and conversely; the integral curves of the ordinary differential equation of the second order
are therefore, at the same time, the characteristics of the partial differential equation (26.3)
of the first order.


In the present case we may find the same result by means of a simple calculation [see below];
for this gives us the differential equations (26.2) and (26.3) in question in the form

\[ y_{xx} F_{y_x y_x} + y_x F_{y_x y} + F_{y_x x} - F_y = 0, \]

\[ (p_x + p p_y) F_{pp} + p F_{py} + F_{px} - F_y = 0, \]

where the lower indices indicate the partial derivatives with respect to $x$, $y$, $p$, $y_x$. The
correctness of the affirmed relation is clear from this.


The close relation derived before and just proved between the ordinary differential equation
(26.2) of the second order and the partial differential equation (26.3) of the first order, is,
as it seems to me, of fundamental significance for the calculus of variations. For, from the
fact that the integral $J^*$ is independent of the path of integration it follows that

\[ \int_a^b \{ F(p) + (y_x - p) F_p(p) \}\,dx = \int_a^b F(\bar{y}_x)\,dx \tag{26.5} \]

if we think of the left hand integral as taken along any path $y$ and the right hand integral
along an integral curve $\bar{y}$ of the differential equation

\[ \bar{y}_x = p(x, \bar{y}). \]

With the help of Eq. (26.5) we arrive at Weierstrass’s formula

\[ \int_a^b F(y_x)\,dx - \int_a^b F(\bar{y}_x)\,dx = \int_a^b E(y_x, p)\,dx, \tag{26.6} \]

where $E$ designates Weierstrass’s expression, depending upon $y_x$, $p$, $y$, $x$,

\[ E(y_x, p) = F(y_x) - F(p) - (y_x - p) F_p(p). \]

Since, therefore, the solution depends only on finding an integral $p(x, y)$ which is single
valued and continuous in a certain neighborhood of the integral curve $\bar{y}$, which we are
considering, the developments just indicated lead immediately, without the introduction
of the second variation, but only by the application of the polar process to the differential
equation (26.2), to the expression of Jacobi’s condition and to the answer to the question:
How far this condition of Jacobi’s in conjunction with Weierstrass’s condition $E > 0$ is
necessary and sufficient for the occurrence of a minimum.
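
Hilbert’s claim that $J^*$ is independent of the path can be checked in the simplest case. The Python sketch below is an illustration under assumptions that are not in Hilbert’s text: it takes the arc-length integrand $F = \sqrt{1 + p^2}$ and the constant slope field $p(x, y) = M$ (a family of parallel straight lines, for which the first-order partial differential equation holds trivially), and it evaluates $J^*$ by the midpoint rule along three different paths from $(0, 0)$ to $(1, 1)$. The names `M`, `hilbert_integral`, and the choice of paths are hypothetical.

```python
import math

M = 0.75  # assumed constant slope of the field: p(x, y) = M for all x, y

def F(p):
    """Arc-length integrand F(p, y; x) = sqrt(1 + p^2) (assumed example)."""
    return math.sqrt(1.0 + p * p)

def F_p(p):
    """Partial derivative of F with respect to p."""
    return p / math.sqrt(1.0 + p * p)

def hilbert_integral(dy, a=0.0, b=1.0, n=1000):
    """Midpoint-rule value of the integral of F(p) + (y_x - p) F_p(p), with y_x = dy(x)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += (F(M) + (dy(x) - M) * F_p(M)) * h
    return total

# Three paths from (0, 0) to (1, 1), described by their slopes y_x.
straight = hilbert_integral(lambda x: 1.0)        # y = x
parabola = hilbert_integral(lambda x: 2.0 * x)    # y = x^2
cubic = hilbert_integral(lambda x: 3.0 * x * x)   # y = x^3
print(straight, parabola, cubic)
```

In this case the integrand simplifies to $(1 + M y_x)/\sqrt{1 + M^2}$, an exact differential, so all three values agree with $(1 + M)/\sqrt{1 + M^2}$ up to the error of the quadrature.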


Hilbert ended his discussion of this problem with some remarks about Kneser’s
approach to Weierstrass’s theory, which led to a partial differential equation that
could be considered a generalisation of the Hamilton-Jacobi equation.

I add a few lines on the derivation of the equation

\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0. \]

If we set $G = A y_x - B$ then $G_y = A_y y_x - B_y$ and $G_{y_x} = A$, so

\[ G_y - \frac{d}{dx} G_{y_x} = A_y y_x - B_y - A_x - A_y y_x = -(A_x + B_y), \]

and so (note that $y' = y_x$) the condition $\delta J^* = 0$ becomes $A_x + B_y = 0$,
as Hilbert said.

I also add a few lines on the derivation of the equation

\[ (p_x + p p_y) F_{pp} + p F_{py} + F_{px} - F_y = 0. \]


This comes about through a change of variable argument, in which the variables x 
and y are replaced by the variables x and p. 
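
The calculation can also be sketched as a direct chain-rule expansion. This is offered as one way of checking the result, on the assumption that $F = F(p, y; x)$ is evaluated at $p = p(x, y)$ and that the partial derivatives in the first-order equation $\frac{\partial F_p}{\partial x} + \frac{\partial (p F_p - F)}{\partial y} = 0$ act on the resulting functions of $x$ and $y$:

```latex
\begin{align*}
\frac{\partial F_p}{\partial x} &= F_{pp}\, p_x + F_{px}, \\
\frac{\partial (p F_p - F)}{\partial y}
  &= p_y F_p + p \left( F_{pp}\, p_y + F_{py} \right) - \left( F_p\, p_y + F_y \right)
   = p\, p_y F_{pp} + p F_{py} - F_y, \\
\frac{\partial F_p}{\partial x} + \frac{\partial (p F_p - F)}{\partial y}
  &= (p_x + p\, p_y) F_{pp} + p F_{py} + F_{px} - F_y .
\end{align*}
```

Setting the last line equal to zero gives the stated equation.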


26.5 Exercises 


Questions 


1. I am not entirely sure why Hilbert was so interested in the calculus of variations 
around 1900; one possibility is that it was the disparity between its importance 
and its complexity that intrigued him. But it is also possible that it was minimising 
principles like the law of least action that had caught his attention. In 1898-99 he 
lectured on mechanics at Göttingen, and topics included “the energy conservation
principle, the principle of virtual velocities and the d’Alembert principle, the
principles of straightest path and of minimal constraint, and the principles of
Hamilton and Jacobi” and their logical and conceptual inter-relations.⁵ Can you
form an assessment of these principles and the relationships between them? 


5The quote comes from Corry ([48], 93). 


Chapter 27
Poincaré and Mathematical Physics


27.1 Introduction 


In the nineteenth century the wave equation, the heat equation, and Laplace’s equation 
(the Dirichlet problem) were solved. Or rather, and more accurately, they were solved 
for a wide range of domains and initial conditions. But there was a fundamental lack 
of clarity about the initial conditions for these equations, and almost nothing was 
known of the possibility of standard methods, such as the method of characteristics, 
for dealing with more general equations. Even the now-standard classification of 
second-order linear partial differential equations into three types (elliptic, parabolic, 
hyperbolic) was only established in 1889, in a paper by Paul du Bois-Reymond. 

In the 1890s Poincaré shook up the subject of partial differential equations with
new methods, and shed light on the question of suitable initial conditions. We shall
consider only his account of the Laplace equation, which was to lead Hadamard
to some surprising insights into the existence and uniqueness of solutions of partial
differential equations. But Poincaré wrote widely on many aspects of partial differential
equations and applied mathematics; his discussions of eigenvalue problems
were a notable success that there is not room to discuss in this book.¹


27.2 The Classical Classification of Linear Partial 
Differential Equations 


The now-standard division of linear second-order partial differential equations into
elliptic, parabolic, and hyperbolic seems to have been introduced surprisingly late, in
a paper Paul du Bois-Reymond published in the Journal für die reine und angewandte
Mathematik on the subject in 1889. By the time the paper appeared he had died, at
the age of 57.

1See Verhulst ([260], Chap. 11).

He had been introduced to mathematical physics by Franz Neumann, who taught 
him about fluids at the University of Zürich, and the subject became the topic of his
Ph.D. at Berlin in 1859. He then pursued a career in mathematics, and eventually 
became a Professor at the Technical University in Berlin in 1884. Although some 
criticised his work for lack of rigour, he made a number of interesting discoveries 
about Fourier series representations, and about the growth of functions and infinite 
numbers. 

The non-degenerate second-order linear partial differential equation with constant
coefficients for an unknown function $u$ in two variables $x$ and $y$ can be reduced to
one of three forms:

\[ u_{xx} + u_{yy} + 2a u_x + 2b u_y + cu = d; \]
\[ u_{xx} - u_{yy} + 2a u_x + 2b u_y + cu = d; \]
\[ u_{xx} + u_y + 2a u_x + cu = d, \]

where $a, b, c, d$ are constants. It is also easy to see that introducing the new variable $v$
by writing $u = e^{\alpha x + \beta y} v$, with suitable constants $\alpha$ and $\beta$,
further reduces these equations to

\[ v_{xx} + v_{yy} + kv = f(x, y); \]
\[ v_{xx} - v_{yy} + kv = f(x, y); \]
\[ v_{xx} + v_y = f(x, y), \]

where $k$ is a constant.
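
For the first of the three forms the reduction can be verified directly. The substitution $u = e^{-ax - by} v$ used below is an assumption made for this sketch; the exponents that work differ from case to case, and the source does not spell them out.

```latex
\begin{align*}
u_x &= e^{-ax-by} (v_x - a v), &
u_{xx} &= e^{-ax-by} (v_{xx} - 2a v_x + a^2 v), \\
u_y &= e^{-ax-by} (v_y - b v), &
u_{yy} &= e^{-ax-by} (v_{yy} - 2b v_y + b^2 v), \\
u_{xx} + u_{yy} + 2a u_x + 2b u_y + cu
  &= e^{-ax-by} \left( v_{xx} + v_{yy} + (c - a^2 - b^2) v \right).
\end{align*}
```

Setting the last expression equal to $d$ gives $v_{xx} + v_{yy} + kv = f(x, y)$ with $k = c - a^2 - b^2$ and $f(x, y) = d\,e^{ax + by}$: the first-order terms have disappeared.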
Du Bois-Reymond wrote the general partial differential equation of the type he 
was considering in the form 
\[ F(z) = Rr + Ss + Tt + Pp + Qq + Zz = 0, \]
where $p, q, r, s, t$ have their usual meanings as the various first ($p$, $q$) and second
($r$, $s$, $t$) derivatives of $z$, and $R, S, T, P, Q, Z$ are sufficiently differentiable functions
of $x$ and $y$.
Differentiation gave him equations such as 


\[ dp = r\,dx + s\,dy, \qquad dq = s\,dx + t\,dy, \]


and so on, and du Bois-Reymond deduced that

\[ R\,dp\,dy + T\,dq\,dx - s \left( R\,dy^2 - S\,dx\,dy + T\,dx^2 \right) + \Pi\,dx\,dy = 0, \tag{27.1} \]

where $\Pi = Pp + Qq + Zz$.
He applied these equations to an arbitrary curve C on a solution surface, which 
necessarily satisfied these equations: 


\[ dz = p\,dx + q\,dy, \quad dp = r\,dx + s\,dy, \quad dq = s\,dx + t\,dy, \quad F = 0, \]


and observed that two of $z, p, q, r, s, t$ remain arbitrary on $C$ but once they are given
the other four can be found by integration. The simplest assumption to make, he said,
is that $z$ and one of $p$ or $q$ is given arbitrarily, and that $z$ and the tangent plane to
the surface are given along a curve $C$. More precisely, he said, no more than two of
$z, p, q$ can be arbitrary, and for curves that project to a given plane curve only one
is at our disposal.

Consider, however, when the equation of the curve is such that 


\[ R\,dy^2 - S\,dx\,dy + T\,dx^2 = 0. \tag{27.2} \]

This equation can be written in the form

\[ (dy - N_+\,dx)(dy - N_-\,dx) = 0, \]

where $N_\pm = \frac{1}{2R} \left( S \pm \sqrt{S^2 - 4RT} \right)$, and, he remarked, the solutions to the equations

\[ dy - N_+\,dx = 0, \qquad dy - N_-\,dx = 0 \]

define two families of curves that cross at every point of the $(x, y)$-plane when
$S^2 - 4RT > 0$. These curves are the projections onto the $(x, y)$-plane of the characteristics
of the partial differential equation $F = 0$.

The situation on an arbitrary curve and a characteristic curve are very different, 
du Bois-Reymond pointed out, in terms of what is known along them, and this is 
connected, he went on, to the question of what has to be given before an integral 
surface is determined. He had investigated this matter in an earlier paper, he said, 
and here he preferred to draw attention to “the most important point” (p. 245), 
which was how widely different the boundary conditions are for real and imaginary 
characteristics. In the case of positive characteristics (real characteristic curves) a 
small change in the curve C changes the surface inside the region bounded by C and 
the characteristics at its end points (du Bois-Reymond thought of these three curves 
as forming a triangle). But with imaginary characteristics, when $S^2 - 4RT < 0$, the
entire solution surface is determined by an arbitrarily small part of the curve C. In 
the case of real characteristics du Bois-Reymond noted that if one attempts to define 
the surface initially along a characteristic, then one needs an extra arbitrary function 
that is not required when starting with a curve that is not a characteristic (one might 
say that the moral is that characteristics make bad boundary curves).

After discussing a number of other topics, du Bois-Reymond turned in Chap. 4
of his paper to the reduction of second-order linear partial differential equations to 
canonical form. He showed that a linear change of variables x and y can have these 
consequences: 


1. When $S^2 = 4RT$, if one of $R$ or $T$ vanishes then so does $S$, and the equation can
be made to take the form

\[ Rr + Pp + Qq + Zz = 0, \quad \text{or} \quad Tt + Pp + Qq + Zz = 0. \]

2. When $S^2 - 4RT > 0$, then either $S$ or both $R$ and $T$ can be made to vanish, and
the equation can be made to take the form

\[ Ss + Pp + Qq + Zz = 0, \quad \text{or} \quad Rr - Tt + Pp + Qq + Zz = 0. \]

3. When $S^2 - 4RT < 0$, then $S$ can be made to vanish, but not $R$ and $T$, and when
$S$ vanishes, the new $R$ and $T$ will have the same sign, and the equation can be
made to take the form

\[ Rr + Tt + Pp + Qq + Zz = 0. \]


At this point he wrote (1889, 265):


I shall call the differential equations in the two first forms parabolic, the second two hyperbolic,
and the third form elliptic.


So any linear, second-order partial differential equation can be reduced to one of 
these forms, provided the condition on R, S, and T is satisfied, and the task of the 
theory of such equations is to study the solutions appropriate to each form. 
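
The classification by the sign of $S^2 - 4RT$, together with the slopes of the real characteristics obtained from $R\,dy^2 - S\,dx\,dy + T\,dx^2 = 0$, can be put into a few lines of Python. This is a minimal sketch with hypothetical function names; it treats $R$, $S$, $T$ as numbers at a single point and compares the discriminant with zero exactly, which suits integer or otherwise exact coefficients but would need a tolerance for measured or computed ones.

```python
import math

def classify(R, S, T):
    """Classify R r + S s + T t + ... = 0 at a point by the sign of S^2 - 4 R T."""
    disc = S * S - 4 * R * T
    if disc > 0:
        return "hyperbolic"
    if disc < 0:
        return "elliptic"
    return "parabolic"

def characteristic_slopes(R, S, T):
    """Real roots N of R N^2 - S N + T = 0, the slopes dy/dx of characteristics; needs R != 0."""
    disc = S * S - 4 * R * T
    if R == 0 or disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(S - root) / (2 * R), (S + root) / (2 * R)})

print(classify(1, 0, -1), characteristic_slopes(1, 0, -1))  # wave equation r - t = 0
print(classify(1, 0, 1), characteristic_slopes(1, 0, 1))    # Laplace equation r + t = 0
print(classify(1, 0, 0), characteristic_slopes(1, 0, 0))    # heat-type equation r - q = 0
```

The wave equation has two real families of characteristics, the Laplace equation has none, and the heat-type equation has a single repeated family, matching the hyperbolic, elliptic, and parabolic cases.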

In Chap. 6 of the paper, du Bois-Reymond offered a proof that a hyperbolic (linear,
second-order) partial differential equation can be solved when one is given an arc
$P_1P_2$ that meets no characteristic more than once, and along which $z$ and $p$ or $z$ and
$q$ (and thus $z$, $p$, and $q$) are given, and the solution holds in the region bounded by
the arc and the characteristics through $P_1$ and $P_2$. He argued that the equation can be
taken in the form

\[ F(z) = s + up + vq + wz = 0, \]


when the characteristics are $x$ = const. and $y$ = const. In this case, the key result
is that the solution of the partial differential equation is known in a rectangle when 
it is known on two adjacent sides, and the data is continuous at the common corner. 
Discontinuities on either edge will propagate into the rectangle. But he admitted that 
his argument was intuitive and inconclusive, and that a rigorous proof would be hard 
to find. All he had was a power series argument for which a good convergence result 
was lacking. 
He ended his paper several pages later, remarking that


In this article, I have pulled together the few cases where, so far as I can see, the principal
integral can be written down at once. Leaving aside a few particular cases, such as Riemann’s 
equation for the propagation of sound, it seems that from now on new and more general 
methods must be found, that I will explain in future papers. 


27.3 Poincaré and the Dirichlet Problem 


In 1890 Poincaré wrote a paper [216] on the partial differential equations of math- 
ematical physics that became much quoted. He began by observing that a number 
of problems in physics—electrostatics, electrodynamics, the propagation of heat, 
optics, elasticity, and hydrodynamics—all lead to the same family of partial dif- 
ferential equations. Among these is Laplace’s equation, which raises the Dirichlet 
problem, where many different boundary conditions can be handled by the method 
of Green’s functions, but there are several others. After surveying them, he went on:


Unfortunately, the first property common to all these problems is their extreme difficulty. 
Not only can one often not solve them completely, but it is only at the price of the greatest 
effort that one can rigorously prove the possibility. 


So, he asked himself, is all this hard work necessary? After all, most physicists do 
very well, guided by their experiments. But, he said, analysis ought to be able to do it, 
and a rigorous proof that a problem can be solved may be quite unsuited to providing 
numerical estimates but it teaches us something. Should we nonetheless relax the 
demand for rigour, on the grounds that the differential equations themselves have 
often been established by less than rigorous arguments, and experimental results are 
necessarily approximate? He rejected this too: how can one decide if a less than 
rigorous argument is valid? Who has the right to say that an argument insufficient 
for a mathematician is good enough for a physicist? Moreover, he concluded, it is 
hard to give up a problem that has not been completely solved, and some of these 
equations also play a role in pure analysis (in Riemann’s work, for example). 

Poincaré then set out a new method for solving the Dirichlet problem; the physi- 
cal context made him cast the problem in three dimensions, which was a significant 
mathematical advance. He remarked that the problem was known to always admit a 
solution, and that Riemann had proved this—remarks that no German contemporary 
in the field would have accepted, even in spirit. Then he noted, more securely, that 
solutions had been provided by Schwarz and Carl Neumann in Germany and Robin 
in France. After that, he presented his own solution, known as the method of “sweep- 
ing out” (“balayage” in French), which has something in common with Schwarz’s 
alternating method. 

Poincaré used Green’s theorem, which says that a solution to a Dirichlet problem 
is obtained by finding a suitable Green’s function. He also used a rigorous version of 
another of Green’s theorems to show that given a positive electric charge at a point 
inside a virtual sphere one can arrange for a charge distribution on the sphere with 
the property that the potential functions of the two distributions agree outside the 
virtual sphere. 
We can give an informal argument to this effect. It is trivially true of a point charge
at the centre of a virtual sphere: the potential outside the sphere is the same as that 
of a uniform charge distribution on the sphere. If we now move the point, keeping 
it inside the virtual sphere, we expect that the charge distribution on the sphere can 
vary in such a way as to produce a new potential function equal to that of the charge 
in its new position. 

Poincaré gave the formulae for this, but left it to the reader to check; in fact this 
had been done first by Thomson and then repeated by Maxwell in his great book on 
electromagnetic theory.²

More specifically, Poincaré showed that a Green’s function U relative to a sphere
of radius R and concentrated at P is the potential function associated to a unit mass
placed at P inside the sphere and a mass of $-\frac{R}{OP}$ placed at Q outside the sphere.³
The contribution to $\frac{dU}{dn}$ of a point M on the sphere is given by


$$\frac{R^2 - OP^2}{R}\,\frac{1}{MP^3}.$$


If the point P is inside the sphere, the potential of a unit charge at P is equal to
$\frac{1}{MP}$ at a point M.

If, therefore, a unit charge is distributed over the sphere in such a way that the
charge density at each point M on the sphere varies inversely with $MP^3$, then,
Poincaré argued, the potential W of this distribution equals $\frac{1}{M'P}$ at points $M'$
outside the sphere, and at points $M''$ inside the sphere the potential is less than $\frac{1}{M''P}$.
From this he deduced that a function equal to the Green’s function U inside the sphere
and zero outside the sphere is harmonic everywhere except at P and on the surface
of the sphere, where it is continuous but its normal derivative jumps by


$$\frac{R^2 - OP^2}{4\pi R\cdot MP^3}.$$


This function is equal to the sum of the potential function associated to a unit charge at the
point P and the potential function associated to a charge density on the sphere given
by


$$-\frac{R^2 - OP^2}{4\pi R\cdot MP^3}.$$


Therefore the function V that is harmonic inside the sphere and takes prescribed
values on the sphere, given by a function $V^0$, is defined by the equation


$$V(M'') = \frac{1}{4\pi R}\int_S \frac{V^0\,(R^2 - OP^2)}{M''P^3}\,d\omega,$$


which differs from the formula above only by a change of sign.


²The method involves inversion in spheres. Inversion in a sphere S with centre O and radius R maps
a point P to the point Q on the half-line from O through P and such that $OP \cdot OQ = R^2$ (the map
is not defined at O). The map switches P and Q, and therefore switches the inside and the outside
of the sphere; it is an anti-conformal map (like a reflection). In the plane it is an inversion in a circle
(see Chap. F). A harmonic function is transformed by an inversion of its domain to another harmonic
function with its singular point somewhere else. Traces of this process are visible in Poincaré’s map.


³Note that if the sphere is inverted into a plane, P and Q become mirror image points; see Sect. D.1.
Importantly, the maximum and minimum values of V lie between those of the given $V^0$,
which can be assumed to be positive. Our earlier informal argument could not guarantee
this crucial detail, which will be used to guarantee the existence of a lower
bound later on.
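
The two claims about the swept-out charge (its potential agrees with that of the point charge outside the sphere and is smaller inside) can be checked numerically. The following is a minimal sketch, not Poincaré's argument: it assumes a sphere of radius 1 centred at the origin, Gaussian units (a unit charge at P has potential 1/MP), a surface density $(R^2 - OP^2)/(4\pi R\cdot MP^3)$, and a simple midpoint rule; the function name and the sample points are hypothetical.

```python
import math


def swept_potential(p, x, radius=1.0, n_theta=120, n_phi=240):
    """Potential at x of the surface charge obtained by sweeping a unit
    charge at p (inside the sphere |m| = radius) onto the sphere.

    The density used is (R^2 - |p|^2) / (4 pi R |m - p|^3); the surface
    integral is approximated by a midpoint rule in spherical angles.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    strength = (radius**2 - sum(c * c for c in p)) / (4.0 * math.pi * radius)
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta          # polar angle of the sample point
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi          # azimuth of the sample point
            m = (
                radius * math.sin(theta) * math.cos(phi),
                radius * math.sin(theta) * math.sin(phi),
                radius * math.cos(theta),
            )
            area = radius**2 * math.sin(theta) * d_theta * d_phi
            density = strength / math.dist(m, p) ** 3
            total += density * area / math.dist(m, x)
    return total


p = (0.3, 0.0, 0.2)          # position of the unit charge, inside the sphere
outside = (0.0, 1.5, 1.0)    # a point M' outside the sphere
inside = (-0.2, 0.1, 0.0)    # a point M'' inside the sphere

# Outside, the two numbers should agree; inside, the first should be smaller.
print(swept_potential(p, outside), 1.0 / math.dist(outside, p))
print(swept_potential(p, inside), 1.0 / math.dist(inside, p))
```

Evaluating the same function at the centre of the sphere gives the total swept charge divided by the radius, which is a convenient check that the charge is conserved.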

Poincaré now showed how to solve the Dirichlet problem for the region R outside
an isolated charged conductor of an arbitrary shape, provided only that it has a tangent
plane at every point and distinct principal curvatures, conditions that enabled
Poincaré to establish some convergence arguments. First, he indicated briefly that
the region R outside the conductor can be covered by an infinite number of spheres
$S_j$, $j = 1, 2, \ldots$, of various sizes so that each point of the region R lies in at least
one of these spheres. Then he proposed to define a harmonic function on the region
R that tends to the value 1 at points P on the surface of the conductor and that tends
to the value 0 as P moves off to infinity.

To do so, he observed that given a sphere $S_j$ and the electric charge it contained, the
charge distribution can be switched with an equivalent one entirely on the surface of 
the sphere. This has no effect on the potential outside the sphere, which is unchanged, 
and reduces the potential inside the sphere. He called this operation “sweeping out” 
the sphere. 

He started with a large external sphere Σ that surrounds the conductor and has a
uniform charge distribution on it that gives rise to a potential $V_0$ outside the sphere
(the potential goes to zero at infinity) and a constant potential of 1 everywhere inside
it, including the conductor. Essentially his argument is that the spheres are swept
out, lowering the potential function by a sequence of harmonic functions inside them
but not altering the potential outside them. So the potential function continues to
take the value 1 on the boundary of the conductor but otherwise drops. It cannot
become negative, because all the charges introduced are positive, so by Harnack’s
theorem (if an increasing or decreasing sequence of harmonic functions is bounded,
it tends to a limit that is a harmonic function) the limit is everywhere harmonic and
it is unaltered on the boundary of the conductor, so it continues to take the value 1
there.

In slightly more detail, at least one of the spheres $S_j$ meets Σ and contains some
of the charge distributed over Σ. Poincaré let $S_1$ be such a sphere and swept it out.
The potential function becomes $V_1$, and there is no charge inside $S_1$. He now swept
out $S_2$, an operation that can put some charge back inside $S_1$. Now he swept out $S_1$
and $S_2$. Then he turned to $S_3$, and so on. To sweep out every sphere infinitely often,
Poincaré swept them out in this order


$S_1, S_2;\ S_1, S_2, S_3;\ S_1, S_2, S_3, S_4;\ \ldots$.


If the nth sweeping-out operation empties the sphere $S_i$, the resulting potential function
$V_n$ agrees with the preceding one, $V_{n-1}$, outside $S_i$, and inside $S_i$ it is less:
$V_n < V_{n-1}$. So everywhere one has $V_n \le V_{n-1}$. Because a negative charge never
occurs, the decreasing sequence of the $V_n$ is bounded below at every point and so tends
to a limit, a function Poincaré called V.
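
The bookkeeping behind the sweeping order is simple enough to sketch in code. The generator below is a hypothetical helper (it lists indices of spheres only and does no potential theory); it reproduces the order 1, 2; 1, 2, 3; 1, 2, 3, 4; and so on, in which every sphere is revisited without end.

```python
from itertools import count, islice


def sweeping_order():
    """Yield sphere indices in the order 1, 2; 1, 2, 3; 1, 2, 3, 4; ..."""
    for block_size in count(2):          # blocks of length 2, 3, 4, ...
        for j in range(1, block_size + 1):
            yield j


first_terms = list(islice(sweeping_order(), 9))
print(first_terms)
```

Sphere number j first appears in the block of length max(j, 2) and then in every later block, which is what guarantees that it is swept out infinitely often.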
Consider now the jth sphere $S_j$, which is swept out infinitely often, say at times
$\alpha_k$, $k = 1, 2, \ldots$. Each time there is no charge in its interior and the corresponding
potential function $V_{\alpha_k}$ is harmonic. Now, the sequence of values of the $V_{\alpha_k}$ tends
to a limit and so Poincaré used Harnack’s theorem from 1887 to deduce that the
limit function V is also harmonic. But every point of the region R lies in at least
one sphere so, said Poincaré, there is a harmonic function defined everywhere on R.
Because everywhere one has $V_0 > V > 0$, and $V_0$ tends to zero at points arbitrarily
far from the conductor, it follows that V tends to zero at points arbitrarily far from
the conductor.

To show that the potential function V(P) tends to 1 as the point P tends to a 
point M, say, on the conductor, Poincaré invoked his assumptions on the shape of 
the conductor to allow him to define a sphere that touches the conductor at M and 
otherwise lies entirely inside the conductor. A limiting argument allowed him to 
show that the potential function on a sequence of points tending to M from outside 
the conductor tends to 1, as required. He concluded that the Dirichlet principle had 
been established; it would have been more accurate to say that the Dirichlet problem 
had been solved. 

Poincaré then spent some time weakening the conditions he had imposed on the 
boundary of the conductor, so that it could, for example, have a finite number of cone 
points. 

Then he defended having introduced a new method into an already popular field, 
although it was no better than those of Robin or Neumann and in some cases actually 
slower. But he argued that no known method allowed one to go beyond the first 
approximation without calculations that were too repellent, and so the skilled analyst 
will welcome a new method, and his, he remarked, was particularly elastic (“if I may 
use the term”—[216], 231). He then proceeded to show how it can be adapted in 
various ways. 


27.4 Exercises 


Questions 


1. Du Bois-Reymond’s classification of linear second-order partial differential equa- 
tions is the formal face of a fundamental division. His presentation emphasises the 
varying nature and therefore role of the associated characteristic curves. Reach 
the same classification by looking at boundary or initial data for these equations. 


Chapter 28
Elliptic Equations and Regular
Variational Problems


28.1 Introduction 


This chapter, together with Chap. 29 on hyperbolic equations, concludes this history of
differential equations. Two topics of considerable importance emerge: the regularity of
the solutions of elliptic equations, which was a particular interest of David Hilbert’s,
and the introduction of more rigorous methods in potential theory.


28.2 Picard on Second-Order Linear Elliptic Equations 


In a paper in the Journal de Mathématiques for 1890, Picard considered the linear 
second-order partial differential equation 


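$$A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2} = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right),$$


in which the coefficients are functions of x and y in a domain for which $B^2 - AC < 0$.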
He showed that the solutions are determined by their values on the boundary of the 
domain, provided the domain is suitably small (a condition that ensures that the 
solution is single-valued and is therefore a function of x and y). 

He took the equation in the form


$\Delta u = u_{xx} + u_{yy} = F(u, u_x, u_y, x, y)$.

To solve it, he took an arbitrary function $u_1(x, y)$ and formed the equation

$\Delta u_2 = F(u_1, u_{1x}, u_{1y}, x, y)$,

which he supposed had the solution $u_2(x, y)$. He then formed the equation

$\Delta u_3 = F(u_2, u_{2x}, u_{2y}, x, y)$,


which he supposed had the solution $u_3(x, y)$, and so on. Each solution is fixed by
its values on the boundary of the domain. If the sequence of functions $u_1, u_2, u_3, \ldots$
converges to a function $u(x, y)$ then the limit function would be a solution of the
equation

$\Delta u = F(u, u_x, u_y, x, y)$,


as required. The issue therefore is to find conditions that guarantee the convergence 
of the sequence of functions. 
Picard began with the case where 


$F(u, u_x, u_y, x, y) = a u_x + b u_y + c$,


where a, b, and c are functions of x and y. Then he dealt with the general case, 
and then with the special case where F is a function of u, x, and y alone, and is 
also an increasing function of u, when he showed that no restrictions on the domain 
are necessary. His method was to establish the theorem for small contours, and 
then extend it to arbitrary ones by looking at overlapping contours, as in Schwarz’s 
alternating method, to which he explicitly referred. 

This case included the partial differential equation 


$u_{xx} + u_{yy} = A(x, y)e^{u}$,


to which Picard paid special attention. As Liouville had shown, this is the equation
for a surface with metric $E = G = e^{u}$, $F = 0$ to have its curvature given, up to a
constant factor, by the function A(x, y).

Finally, Picard observed that the same method of successive approximations could 
be used to prove the existence of solutions to ordinary differential equations.¹
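
The ordinary differential equation version of the method is easy to illustrate. The following is a minimal sketch, assuming the model problem y' = y, y(0) = 1, which is chosen here for illustration and is not taken from Picard's paper; the function names are hypothetical. Each step replaces the current approximation y by 1 plus the integral of y from 0 to x, and polynomials are stored as lists of exact rational coefficients.

```python
from fractions import Fraction


def picard_step(coeffs):
    """One successive-approximation step for y' = y, y(0) = 1.

    coeffs holds polynomial coefficients, lowest degree first; the result
    is 1 + the integral from 0 to x of the polynomial.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1                   # add the initial value y(0) = 1
    return integrated


def picard_iterates(n):
    """Apply n steps, starting from the constant function 1."""
    y = [Fraction(1)]
    for _ in range(n):
        y = picard_step(y)
    return y


# Coefficients of the third approximation, lowest degree first.
print(picard_iterates(3))
```

For this model problem the nth approximation is the nth partial sum of the exponential series, so the convergence that has to be proved in general is visible directly.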

We can look a small way past the introduction and glimpse the subtleties of the 
problem. One case that Picard looked at was the equation 


$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y).$$


In this case, the solution of the partial differential equation is


$$u(\xi, \eta) = -\frac{1}{2\pi}\iint f(x, y)\,G(x, y, \xi, \eta)\,dx\,dy,$$


where the double integral is taken over the given surface and G is a Green’s function
that becomes infinite at the point $(\xi, \eta)$ of the surface like log(1/r) and vanishes on
the boundary.²


¹The method had been used earlier by Liouville in connection with Sturm–Liouville theory; see
Lützen ([192], Chap. X).

Picard investigated this solution and found that it was necessary to ensure an upper
bound on $\partial u/\partial x$ and $\partial u/\partial y$. This he could do when the boundary was either a circle or a
curve analytically equivalent to a circle.


For the more complicated equation 


$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = a u_x + b u_y + c,$$

where $a u_x + b u_y + c$ is continuous, it was not possible to write down the solution,
and so an iterative approach had to be used. Picard now showed that the sequence 
of approximations converged provided the boundary curve was small, by which he 
meant that the solution function u(x, y) has no chance to grow uncontrollably and 
thereby cease to be single-valued. 

In a paper published in the Journal de l’École Polytechnique in the same
year, 1890, Picard imposed the condition that the coefficients are analytic functions
of x and y, and was able to show that in this case the solution is also analytic.³
His method was the method of successive approximations.


28.3 Hilbert’s Problems 19 and 20 


This and the next extract are taken from the published version of Hilbert’s address 
on the problems of mathematics at the ICM in Paris in 1900, ([143]). 


19. ARE THE SOLUTIONS OF REGULAR PROBLEMS IN THE CALCULUS OF VARI- 
ATIONS ALWAYS NECESSARILY ANALYTIC ? 


One of the most remarkable facts in the elements of the theory of analytic functions appears to 
me to be this: That there exist partial differential equations whose integrals are all of necessity 
analytic functions of the independent variables, that is, in short, equations susceptible of 
none but analytic solutions. The best known partial differential equations of this kind are the 
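potential equation
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$
and certain linear differential equations investigated by Picard [J. Ec Poly 1890]; also the
equation
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = e^f,$$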


²Later, in §4 of his paper, he showed that u will exist even if f is not required to be continuous, but
for u to be twice differentiable it is necessary that f be continuously once differentiable or at least
satisfy some sort of Hölder condition (not to be discussed here).

³He defined an analytic function of two variables to be one that can be written as a convergent
power series in the variables.
the partial differential equation of minimal surfaces, and others. Most of these partial differential
equations have the common characteristic of being the Lagrangian differential equations
of certain problems of variation, viz., of such problems of variation


$$\iint f(p, q, z; x, y)\,dx\,dy = \text{minimum} \qquad \left[p = \frac{\partial z}{\partial x},\ q = \frac{\partial z}{\partial y}\right],$$


as satisfy, for all values of the arguments which fall within the range of discussion, the
inequality


$$\frac{\partial^2 f}{\partial p^2}\cdot\frac{\partial^2 f}{\partial q^2} - \left(\frac{\partial^2 f}{\partial p\,\partial q}\right)^2 > 0,$$


f itself being an analytic function. We shall call this sort of problem a regular variation
problem. It is chiefly the regular variation problems that play a rôle in geometry, in mechanics,
and in mathematical physics; and the question naturally arises, whether all solutions of
regular variation problems must necessarily be analytic functions. In other words, does every
Lagrangian partial differential equation of a regular variation problem have the property
of admitting analytic integrals exclusively? And is this the case even when the function is
constrained to assume, as, e. g., in Dirichlet’s problem on the potential function, boundary
values which are continuous, but not analytic?


I may add that there exist surfaces of constant negative Gaussian curvature which are rep- 
resentable by functions that are continuous and possess indeed all the derivatives, and yet 
are not analytic; while on the other hand it is probable that every surface whose Gaussian 
curvature is constant and positive is necessarily an analytic surface. And we know that the 
surfaces of positive constant curvature are most closely related to this regular variation prob- 
lem : To pass through a closed curve in space a surface of minimal area which shall enclose, in 
connection with a fixed surface through the same closed curve, a volume of given magnitude. 
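
To see what the inequality in Hilbert's definition of a regular variation problem demands, it can be evaluated for the integrands behind two of the equations he names, the potential equation and the equation of minimal surfaces. The following is only an illustrative check; the choice and normalisation of the two integrands, and the abbreviation w, are assumptions made here and are not part of Hilbert's text.

```latex
% Dirichlet integrand: f = p^2 + q^2
\[
  f_{pp} f_{qq} - f_{pq}^2 = 2 \cdot 2 - 0^2 = 4 > 0 .
\]
% Area integrand: f = \sqrt{1 + p^2 + q^2} = w
\[
  f_{pp} = \frac{1 + q^2}{w^3}, \qquad
  f_{qq} = \frac{1 + p^2}{w^3}, \qquad
  f_{pq} = -\frac{pq}{w^3},
\]
\[
  f_{pp} f_{qq} - f_{pq}^2
  = \frac{(1 + p^2)(1 + q^2) - p^2 q^2}{w^6}
  = \frac{1}{w^4} > 0 .
\]
```

Both integrands are analytic in p and q and satisfy the inequality for all values of the arguments, so both problems are regular in Hilbert's sense.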


Hilbert then went on:


20. THE GENERAL PROBLEM OF BOUNDARY VALUES. 


An important problem closely connected with the foregoing is the question concerning the 
existence of solutions of partial differential equations when the values on the boundary of 
the region are prescribed. This problem is solved in the main by the keen methods of H. 
A. Schwarz, C. Neumann, and Poincaré for the differential equation of the potential. These 
methods, however, seem to be generally not capable of direct extension to the case where 
along the boundary there are prescribed either the differential coefficients or any relations 
between these and the values of the function. Nor can they be extended immediately to 
the case where the inquiry is not for potential surfaces but, say, for surfaces of least area, 
or surfaces of constant positive Gaussian curvature, which are to pass through a prescribed 
twisted curve or to stretch over a given ring surface. It is my conviction that it will be possible 
to prove these existence theorems by means of a general principle whose nature is indicated 
by Dirichlet’s principle. This general principle will then perhaps enable us to approach the 
question : Has not every regular variation problem a solution, provided certain assumptions 
regarding the given boundary conditions are satisfied (say that the functions concerned in 
these boundary conditions are continuous and have in sections one or more derivatives), and 
provided also if need be that the notion of a solution shall be suitably extended? [Cf. my
lecture on Dirichlet’s principle in the Jahresbericht der Deutschen Math.-Vereinigung, vol.
8 (1900), p. 184.]
Hilbert then published two papers on the Dirichlet problem. The first is an indication
of what he developed at length in the second. It would take us too far afield to
follow him and to sort out what he did, and did not, achieve with them, but we can
see how he intended to elucidate his programmatic remarks in Paris by looking at an
extract from the first paper ([142]).


The Dirichlet principle is a method that Dirichlet, drawing on an idea of Gauss, used to 
solve the so-called boundary value problem, and which can be briefly characterised in the 
following way. One erects verticals at the points of the given boundary curve in the (x, y)-plane
and gives them the corresponding boundary values. Among the surfaces z = f(x, y) that
are bounded by this curve one looks for a surface that minimises the value of the integral


$$J(f) = \iint \left\{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right\} dx\,dy.$$


This surface, as one can easily see using the calculus of variations, is necessarily a potential 
surface. With the use of considerations of this kind Riemann gave a proof of the existence 
of the solution to the boundary value problem and then immediately based his great theory 
of Abelian functions upon it. 


It was first recognised by Weierstrass that this use of the method of the Dirichlet principle 
is not sound; indeed, if only a finite number of numerical values are given one can con- 
clude without further ado that there must be a least numerical value among them; from an 
unbounded number of numerical values one cannot conclude that there is a least one; rather,
it requires a proof that in the given case there is a surface z = f(x, y) which gives the least
value of the integral J(f).


The important researches of C. Neumann, H.A. Schwarz and H. Poincaré have shown that 
under certain very general assumptions about the nature of the boundary curve and the 
boundary values the boundary value problem is solvable and therefore the existence of a 
minimal function f(x, y) is assured. 


The Dirichlet principle owes its fame to the attractive simplicity of its fundamental mathe- 
matical idea, to the undeniable richness of its possible applications to pure and to physical 
mathematics and its intrinsic plausibility. But since Weierstrass’s critique the Dirichlet prin- 
ciple became of historical value only and seemed to lose its ability to lead to solutions of the 
boundary value problem. C. Neumann spoke regretfully that the so beautiful and so much- 
used Dirichlet principle would now always decline; only A. Brill and M. Noether called for 
new hope to grow in us and expressed the conviction that the Dirichlet principle, present in 
nature, could once more enjoy a revival, perhaps in a modified fashion. 


The following is an attempt at a revival of the Dirichlet principle. 


Inasmuch as we think of the Dirichlet problem as only a particular problem in the calculus 
of variations, we have been led to express it in the following more general form. Every 
regular problem in the calculus of variations has a solution provided suitable restrictions are 
imposed on the given boundary conditions and if necessary the idea of a solution is suitably 
extended.⁴


How this principle can be used as a guide to the discovery of rigorous and simple existence 
proofs will be shown by the following two examples: 


I. Draw the shortest curve between two given points P and $P_1$ on a given surface z = f(x, y).


Let $l$ be the lower bound of the lengths of all curves on the surface between the two points. From the
totality of all connecting curves we look for those curves $C_1, C_2, C_3, \ldots$ whose lengths


⁴Hilbert here footnoted his Paris address and the papers ([14, 15]).
$L_1, L_2, L_3, \ldots$ respectively approach the limit $l$. On $C_1$ we draw from P a length $\tfrac{1}{2}L_1$ and
obtain on $C_1$ the point $P^{(1)}_{1/2}$; on $C_2$ we draw from P a length $\tfrac{1}{2}L_2$ and obtain on $C_2$ the
point $P^{(2)}_{1/2}$; on $C_3$ we draw from P a length $\tfrac{1}{2}L_3$ and obtain on $C_3$ the point $P^{(3)}_{1/2}$, and so
on. The points $P^{(1)}_{1/2}, P^{(2)}_{1/2}, P^{(3)}_{1/2}, \ldots$ have an accumulation point $P_{1/2}$ which is also a point of
the surface z = f(x, y).

This procedure, which we have applied to P and $P_1$, and which has led to the point $P_{1/2}$, we
now apply to the points P and $P_{1/2}$ and obtain in this way the point $P_{1/4}$ on the given surface,
and also the point $P_{3/4}$ when we apply the procedure to $P_{1/2}$ and $P_1$. In the same way we find
the points $P_{1/8}, P_{3/8}, P_{5/8}, P_{7/8}, P_{1/16}, \ldots$. All these points and their accumulation points
taken together form a continuous curve that is the sought-for shortest curve.


The proof of this fact is easily found when one thinks of the length of a curve as defined 
as the limiting value of the lengths of inscribed polygons. As we see at the same time, it
is necessary for this approach that we assume that the given function f(x, y) and its first 
differential quotients with respect to x and y are continuous. 


Hilbert’s second example was that of the Dirichlet problem itself. Unfortunately, 
his argument is too long to reproduce here. 


28.4 Exercises 


Questions 


1. Hilbert’s interest in the regularity of solutions to partial differential equations is 
perhaps a pure mathematician’s attitude. Do you agree? What are the implications 
for physics of his conjecture (when it is proved, as it shortly was)? 


Chapter 29
Initial Value Conditions for Hyperbolic
Partial Differential Equations


29.1 Introduction 


By the late nineteenth century it was becoming clear, as the work of Picard and others
shows, that solutions to hyperbolic partial differential equations have a particular kind
of relation to the initial conditions that can be imposed on them. The person who
cleared this up decisively was the French mathematician Jacques Hadamard who, in the
early years of the twentieth century, made a powerfully provocative study of the
relation between elliptic and hyperbolic partial differential equations and between
boundary and initial conditions.


29.2 Picard on Second-Order Linear Hyperbolic Equations 


In the same paper in the Journal de Mathématiques for 1890 that we have already 
looked at, Picard showed how to use the method of successive approximations to 
solve hyperbolic second-order partial differential equations. One of his examples, 
the equation 
$$\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz, \qquad (29.1)$$


where a, b, c are functions of x and y, is reproduced in Sect. 31.8, but here it is
more worth repeating his analysis of why his method for solving hyperbolic partial
differential equations would not work for elliptic ones. He had posed the problem of
solving the partial differential equation for which the solution is to take a prescribed
value at a point on an arc AB and its first partial derivatives are to take prescribed
values along the arc.
It is quite otherwise when the characteristics are imaginary. To see this, it suffices to take
the simple example of the equation

$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0.$$

In general one cannot have a solution of this equation that is continuous in the rectangle 
ABA′B′ along with its first-order partial derivatives, and for which $\partial z/\partial x$ and $\partial z/\partial y$ take on the arc
AB the succession of values denoted above by v(x) and y(y), these functions being subject 
to no other condition than being continuous. In the contrary case one could, in effect, form 
an analytic function $z + iz_1$ that will be holomorphic in the rectangle under consideration,
the real part of this function being arbitrary on the curve AB, which is impossible because a 
holomorphic function determined on an arc of a curve however small can only be extended 
in a unique way. 


29.3 Hadamard and Mathematical Physics


Jacques Hadamard (Fig. 29.1) was a remarkable analyst, the first (with de la Vallée 
Poussin, independently) to prove the prime number theorem, which says that the 
number of prime numbers less than a real number x is well approximated by x/ln x,
but his particular fields were integral equations, the calculus of variations, and partial 
differential equations. 

But before we proceed, I cannot resist this anecdote, which dates from a scientific 
Jubilee in honour of Hadamard in 1937. Picard had wondered if Hadamard would 
still remember his lectures on rational mechanics and Hadamard replied:¹


It is perfectly true that you had accepted the task — should I say the burden — of involving 
us in that artificial and lamentably monotonous exercise that is the problem of mechanics 
for the degree. You had been able to render it almost interesting; I always asked myself how 
you were able to do that, because I was never able to when it was my turn. 


In his paper [132] Hadamard made some important remarks about the two types of
problem for partial differential equations that typically arise in physics: the Dirichlet
problem and the Cauchy problem. In the Dirichlet problem the unknown function (say, a
function of two variables) is required to satisfy a given condition at each point on
the boundary of its domain. In the Cauchy problem, the boundary information is the
value of a function and one of its first derivatives at each point on some boundary.

“These problems”, he said, “are presented in every sort of question in mathemat- 
ical physics. However, there is an extensive list of cases in which one or the other 
is presented as if it is well posed, I want to say as possible and determined”. What 
he wished to point out, he said, was that “these two circumstances are intimately 
related, the one to the other, and this in a sufficiently close way that of the two prob- 
lems, entirely analogous in appearance, one can be possible and the other impossible 
according to how they correspond or not to a physical given”. 


¹See Cartwright ([33], 77).
Fig. 29.1 Jacques Hadamard (1865–1963)


Hadamard’s language needs a little unpicking. By a possible problem, he meant 
one admitting a solution, and by a determined problem one that has a unique solution. 
His discovery was that the two boundary conditions work very differently, even
though the equations look very similar, and that whether one problem has a solution
when the other does not depends on whether the boundary conditions make physical
sense.

He illustrated his point with two examples. Laplace’s equation (A) in three dimen- 
sions leads to Dirichlet’s problem, which is a possible and determined problem. On 
the other hand, the Cauchy problem for equation (A) asks for the determination for 
x > 0 of a solution such that, for x = 0, 


$$u = u_0, \qquad \frac{\partial u}{\partial x} = u_0',$$


where $u_0$ and $u_0'$ are given functions of y, z. “This problem” he said (p. 214), “which
has no physical significance, can always be solved when $u_0$ and $u_0'$ are analytic, but
we know today that it is quite otherwise in the general case”, and he gave a brief 
explanation of why this was so. 

Suppose, for example, that $u_0$ has been defined on a circle C in the plane x = 0
and the hemisphere S in the region x > 0 bounded by that circle and within which the 
Dirichlet problem is known to be solved. The unknown function u will be determined 
by the values it takes on C and S, because they define a region for which the Dirichlet 
problem is known to be solved. The part corresponding to the values on S defines 
an analytic function of x, y, z inside C, so one can say that W is the potential of a 
double layer distributed in the (y, z) plane, the thickness (density?) of this double 
layer being represented at each point by uo. However, the Cauchy problem is only 
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possible if 
1 oW 


0-=—— 


2m Ox 


/ 


is an analytic function. 
Then he considered the wave equation (B). The Cauchy problem in this case asks
for the solution for t > 0 such that, when t = 0,

\[ u = U \quad\text{and}\quad \frac{\partial u}{\partial t} = U', \]


U and U’ being given functions of x, y, z. Such a problem is, in general, possible 
and determined, he said. The solution is given by Poisson’s formula 


\[ u(x, y, z, t) = \frac{\partial}{\partial t}\bigl(t\,[U](t)\bigr) + t\,[U'](t), \]


where [U] and [U’] are the mean values of U and U’ on the sphere with centre (x, y, z)
and radius t.
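
The formula can be checked numerically. The following is a minimal sketch in Python, not taken from Hadamard: it assumes unit wave speed and the standard form of Poisson's formula, $u = \partial_t\bigl(t\,[U](t)\bigr) + t\,[U'](t)$, and the names `spherical_mean` and `poisson_solution` are purely illustrative. For the data U = x² and U' = y the exact solution of the wave equation is u = x² + t² + ty, which the quadrature should reproduce up to its discretisation error.

```python
import math

def spherical_mean(f, centre, radius, n_theta=64, n_phi=128):
    """Mean of f over the sphere with the given centre and radius (midpoint rule)."""
    cx, cy, cz = centre
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = math.pi * (i + 0.5) / n_theta
        w = math.sin(theta)  # the area element is sin(theta) dtheta dphi
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            x = cx + radius * math.sin(theta) * math.cos(phi)
            y = cy + radius * math.sin(theta) * math.sin(phi)
            z = cz + radius * math.cos(theta)
            total += w * f(x, y, z)
            weight += w
    return total / weight

def poisson_solution(U, U_prime, point, t, h=1e-3):
    """Assumed form: u = d/dt ( t [U](t) ) + t [U'](t), central difference in t."""
    def t_times_mean(s):
        return s * spherical_mean(U, point, s)
    d_dt = (t_times_mean(t + h) - t_times_mean(t - h)) / (2.0 * h)
    return d_dt + t * spherical_mean(U_prime, point, t)

# Illustrative data (an assumption): U = x^2, U' = y, so u = x^2 + t^2 + t*y.
point, t = (0.3, -0.2, 0.5), 0.7
approx = poisson_solution(lambda x, y, z: x * x, lambda x, y, z: y, point, t)
exact = point[0] ** 2 + t ** 2 + t * point[1]
print(approx, exact)
```

The two printed numbers should agree to about four decimal places; the residual difference comes from the midpoint rule in the polar angle and the finite difference in t.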

But, said Hadamard, one should not infer that the Cauchy problem for the equation
is always possible and determined. That is the case when t is taken as the principal
variable, but it is false for the same equation when the principal variable is x, y, or
z. For example, taking it with respect to x and with functions $u_0$ and $u_0'$ independent
of t, problem (B) now reduces to problem (A) and is therefore impossible in general.

In more detail, taking x as the principal variable, the Cauchy problem asks for the
solution for x > 0 such that, when x = 0,

\[ u = U \quad\text{and}\quad \frac{\partial u}{\partial x} = U', \]

U and U’ being given functions of y, z, and t. Suppose, he said, that the functions U
and U’ are independent of t. In this case, if the solution is unique then u will certainly
be independent of t, but this reduces problem (B) to problem (A), which we have
just seen is in general impossible.
Could there, instead, be infinitely many solutions u that take the same value on
x = 0 and for which also the values of $\partial u/\partial x$ are the same? If so, then there is a
solution u of problem (B) that is not identically zero but vanishes on x = 0 along with
$\partial u/\partial x$. Any such solution can be defined in the region x < 0 by the simple formula
u(−x, y, z, t) = u(x, y, z, t). A consideration of the implications of Poisson’s formula
allowed Hadamard to prove that the only solution of (B) that on x = 0 satisfies
$u_0 = 0 = u_0'$ is the function that vanishes everywhere. Therefore the Cauchy problem
in this case cannot be indeterminate and it follows that it is in general impossible.
Hadamard concluded that the Cauchy problems for t = 0 and x = 0 are very
different, and the problem in the second case is much closer to the theory of equations
with imaginary characteristics.


29.4 The Cauchy Problem


Hadamard began his book by introducing Cauchy’s use of boundary conditions when 
solving a second-order, linear partial differential equation, and he concluded the first 
chapter by writing (§14): 


The result of Cauchy’s and Sophie Kowalewsky’s analysis would therefore be that Cauchy’s 
problem has one (and only one) solution every time the surface which bears the data is not 
characteristic, nor tangent anywhere to a characteristic. (emphasis Hadamard’s) 


But then he immediately went on in Chapter Two to say that in fact the true situation 
was not so simple, and indeed was almost paradoxical. 


The reasonings of Cauchy, S. Kowalewsky and Darboux, the equivalent of which has been 
given above, are perfectly rigorous; nevertheless, their conclusion must not be considered as 
an entirely general one. The reason for this lies in the hypothesis, made above, that Cauchy’s 
data, as well as the coefficients of the equations, are expressed by analytic functions; and the 
theorem is very often likely to be false when this hypothesis is not satisfied. [...] Indeed, one 
of the most curious facts in this theory is that apparently very slightly different equations 
behave in quite opposite ways in this matter. 


To defend his position, he compared the way Cauchy data and Dirichlet data work.
There are occasions when Cauchy’s approach to a second-order partial differential
equation, which involves specifying initial data in the form of values of the solution
function and its first derivatives on a hypersurface, is valid without any requirement of
analyticity. (Hadamard defined an analytic function on an interval as one admitting
a power series expansion.)

In contrast, the Dirichlet problem for a region requires only that the boundary
values of the solution function be specified. For a region V bounded by a surface S:


It is a known fact that this problem is correctly set: i.e. it has one (and only one) solution.
This fact immediately appears as contradictory to Cauchy-Kowalewsky’s theorem: for, if the
knowledge of numerical values of u, at the points of S (together with the partial differential
equation) is by itself sufficient to determine the unknown function within V, we evidently
have no right to impose upon u any additional condition, and we cannot therefore, besides
values of u, choose arbitrarily those of $\frac{du}{dn}$.


To understand the deep reason for this, Hadamard noted but set aside the fact that 
the data in a Cauchy problem and a Dirichlet problem are specified on topologically 
distinct regions. He thought it more important that Cauchy data can only supply a 
solution in a neighbourhood of the hypersurface, whereas the Dirichlet data leads to 
a solution valid throughout the enclosed region. 

He then argued that in fact if data on even a small part S of a hypersurface is
non-analytic there will be no solution of Laplace’s equation valid in a neighbourhood
of S. For if there were, there would be harmonic functions defined on either side of
S with the same normal derivatives at each point of S. But this means that the two
harmonic functions are analytic extensions of each other, and so their values on S
must be analytic, contrary to assumption.

The way out of this paradox would come, he suggested (§15), by following
Poincaré’s advice, for


No question offers a more striking illustration of the ideas which Poincaré developed at the 
first International Mathematical Congress at Zurich, 1897 (see also La Valeur de la Science, 
pp. 137-155), viz. that it is physical applications which show us the important problems we 
have to set, and that again Physics foreshadows the solutions. 


Hadamard then gave a more detailed and technical examination of the nature of 
boundary data, before returning to his broad theme. This was that theorems proved for 
analytic functions may not be true when more general types of function are considered 
and that the physical interpretation of the problem is a sure guide to whether Cauchy 
data or Dirichlet boundary data are appropriate. He strongly suggested (§18) that


This remarkable agreement between the two points of view appears to me as an evidence that 
the attitude which we adopted above — that is, making a rule not to assume analyticity of data 
— agrees better with the true and inner nature of things than Cauchy’s and his successors’ 
previous conception. 


There then followed one of Hadamard’s more famous observations, which is worth
savouring for its own sake. He had in mind a theorem of Weierstrass’s that ensured
that any continuous function may be approximated arbitrarily well by an analytic 
function. This being the case, why not replace a non-analytic partial differential 
equation and non-analytic data with very good analytic approximations? Surely this 
will produce arbitrarily good approximations to the solution of the original non- 
analytic problem? 

Hadamard remarked (§18):


I have often maintained, against different geometers, the importance of this distinction. Some
of them indeed argued that you may always consider any functions as analytic, as, in the 
contrary case, they could be approximated with any required precision by analytic ones. 
But, in my opinion, this objection would not apply, the question not being whether such an 
approximation would alter the data very little, but whether it would alter the solution very 
little. It is easy to see that, in the case we are dealing with, the two are not at all equivalent. 


Let us take the classic equation of two-dimensional potentials

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \]

with the following data of Cauchy’s

\[ u(0, y) = 0, \qquad \frac{\partial u}{\partial x}(0, y) = u_1(y) = A_n \sin(ny), \tag{15} \]

n being a very large number, but $A_n$ a function of n assumed to be very small as n grows
very large (for instance $A_n = n^{-p}$). These data differ from zero as little as can be wished.


Fig. 29.2 The graph of 4 sin(ny) sinh(nx), −π < y < π, 0 < x < 2, n = 10


The Cauchy problem with the data

\[ u(0, y) = 0, \qquad \frac{\partial u}{\partial x}(0, y) = 0 \]

has the unique solution u(x, y) = 0. As for the new problem, with data differing
from zero by an arbitrarily small amount, Hadamard continued (see Fig. 29.2):


Nevertheless, a Cauchy problem has for its solution

\[ u = \frac{A_n}{n} \sin(ny) \sinh(nx), \]

which, if $A_n = \frac{1}{n}$, $\frac{1}{n^p}$, $e^{-\sqrt{n}}$, is very large for any determinate value of x different from zero
on account of the mode of growth of $e^{nx}$ and consequently sinh(nx).

In this case, the presence of the factor sin ny produces a “fluting” of the surface, and we
see that this fluting, however imperceptible in the immediate neighbourhood of the y-axis, 
becomes enormous at any given distance of it however small, provided the fluting be taken 
sufficiently thin by taking n sufficiently great. 
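
The growth Hadamard describes is easy to tabulate. The sketch below is an illustration only: the function names, the sample distance x = 0.1 and the exponent p = 2 are assumptions rather than anything in Hadamard's text, and the three choices of $A_n$ are those the passage appears to list. It compares the size $A_n$ of the data with the size $(A_n/n)\sinh(nx)$ of the corresponding solution.

```python
import math

def data_amplitude(n, choice, p=2):
    """Size A_n of the Cauchy data u_x(0, y) = A_n sin(n y)."""
    if choice == "1/n":
        return 1.0 / n
    if choice == "1/n^p":
        return 1.0 / n ** p
    if choice == "exp(-sqrt(n))":
        return math.exp(-math.sqrt(n))
    raise ValueError(f"unknown choice: {choice}")

def solution_amplitude(n, x, choice, p=2):
    """Largest value over y of |u(x, y)| for u = (A_n / n) sin(n y) sinh(n x)."""
    return data_amplitude(n, choice, p) / n * math.sinh(n * x)

x = 0.1  # a fixed small distance from the line x = 0 (illustrative value)
for choice in ("1/n", "1/n^p", "exp(-sqrt(n))"):
    for n in (10, 100, 1000):
        print(f"{choice:>14}  n={n:<5} data={data_amplitude(n, choice):.3e}  "
              f"solution={solution_amplitude(n, x, choice):.3e}")
```

In every column the data shrink as n grows while the solution at the fixed distance x eventually grows without bound, which is the “fluting” of the quotation.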


After some more technical matters Hadamard then observed 


21. Another paradoxical consequence furthermore appears if we consider things from the 
concrete point of view. 


Strictly, mathematically speaking, we have seen (this is Holmgren’s theorem) that one set
of Cauchy’s data $u_0$, $u_1$ corresponds (at most) to one solution of [Laplace’s equation], so
that, if these quantities $u_0$, $u_1$ were “known,” u would be determined without any possible
ambiguity.²

But, in any concrete application, “known,” of course, signifies “known with a certain
approximation,” all kinds of errors being possible, provided their magnitude remains smaller than
a certain quantity; and, on the other hand, we have seen that the mere replacing of the value
zero for $u_1$ by the (however small) value (15) changes the solution not by very small but
by very great quantities. Everything takes place, physically speaking, as if the knowledge of
Cauchy’s data would not determine the unknown function.


This shows how very differently things behave in this case and in those which correspond
to physical questions. If a physical phenomenon were to be dependent on such an analytical
problem as Cauchy’s for $\nabla^2 u = 0$, it would appear to us as being governed by pure
chance (which, since Poincaré, has been known to consist precisely in such a discontinuity
in determinism) and not obeying any law whatever.


² Holmgren published this theorem in [146].


After having been led by physical interpretation to the need of the above distinctions, we
must now try to formulate them analytically. This is subordinate to the classification of linear 
partial differential equations of the second order into different types. 


These are the hyperbolic, parabolic, and elliptic types, but, because Hadamard 
always emphasised the importance of working in any number m of variables, he 
distinguished among the hyperbolic types between those in which all but one of the 
m squares have the same sign—which he called the normal hyperbolic type—and 
the others. 

He then observed that the normal hyperbolic type is the only one known in which 
Cauchy’s problem can be correctly set, and the non-normal hyperbolic types are not 
known to be connected to any physical problem and do not lead to any problem 
known to comply with Cauchy’s condition. Finally, for reasons Hadamard explained 
later in the book, elliptic equations never lead to correctly set Cauchy problems. 


29.4.1 Commentary and Concluding Remarks 


Hadamard’s work, which we have done little more than sample here, established 
three things. First, that any theory of partial differential equations deals not only 
with a differential equation but with some boundary or initial conditions. Second, 
that elliptic and hyperbolic equations are very different in this respect (and that 
parabolic equations exhibit some features of each type). Third, that there is a class of 
partial differential equations that are what he called well posed: they have solutions, 
these solutions are unique, and they depend continuously on the initial data and any 
parameters that enter the problem. 
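
Continuous dependence can be made concrete by comparing the two model equations on the same oscillatory data. The sketch below is an illustration under stated assumptions, not part of Hadamard's text. With the function itself zero and its first derivative equal to $A\sin(ny)$ on the initial line, the solution of $u_{tt} = u_{yy}$ is $(A/n)\sin(ny)\sin(nt)$, while that of $u_{xx} + u_{yy} = 0$ is $(A/n)\sin(ny)\sinh(nx)$. The hypothetical helpers `wave_response` and `laplace_response` return the largest value of the solution divided by the amplitude A of the data.

```python
import math

def wave_response(n, t):
    """max_y |u| / A for u_tt = u_yy, u(y, 0) = 0, u_t(y, 0) = A sin(n y)."""
    return abs(math.sin(n * t)) / n

def laplace_response(n, x):
    """max_y |u| / A for u_xx + u_yy = 0, u(0, y) = 0, u_x(0, y) = A sin(n y), x > 0."""
    return math.sinh(n * x) / n

for n in (1, 10, 100, 1000):
    print(f"n={n:<5} wave={wave_response(n, 0.1):.3e}  laplace={laplace_response(n, 0.1):.3e}")
```

The hyperbolic ratio never exceeds 1/n, so small data give small solutions; the elliptic ratio is unbounded in n at any fixed x > 0, so no such estimate is possible.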

The first point makes clear what Kovalevskaya seems to have suspected, and
Riemann quite likely understood: that the solution of a partial differential equation
is not some general expression that is made precise when some extra information is
supplied (as in the theory of ordinary differential equations). That older view is a
natural one, it was held by Euler and Lagrange, and it is ultimately shallow. The theory
of partial differential equations is instead a dialogue between the equation and its
boundaries.

The second point was surely understood by Riemann, but Hadamard’s insight can 
be amplified here. Solutions to a hyperbolic partial differential equation propagate 
at a given speed that reflects some aspect of the situation being described; solutions 
to elliptic equations propagate instantaneously. The other side of that coin is that the 
solution to an elliptic equation at a point depends on all the boundary values, but the 
solution to a hyperbolic equation at a point depends only on the nearby boundary 
values. 

The third point is Hadamard’s most original. He believed that the partial differen- 
tial equations that arise in science are well posed and this is why they can be profitably 
studied, and that problems that are not well posed (ill-posed, as they are called) are 
likely to be both difficult and artificial. Although it is true that today some naturally 
occurring ill-posed problems are studied, Hadamard’s observation is deeper than it 
looks and may yet have useful things to say.


29.5 Exercises


Questions 


1. Hadamard’s remarks brought finally into light a fundamental failure of the first 
generations of people who studied partial differential equations, in that he showed 
that these equations cannot be studied without the accompanying boundary con- 
ditions. To what extent does this course suggest that boundary conditions were 
initially almost ignored (they were to be fitted in after the equation was solved), 
then incorporated, then appreciated (and given equal status with the equation)? 


Chapter 30
Revision


30.1 Revision and Assessment 3 


This chapter is given over to revision and discussion of the final assignment, see H.4. 

However, I would like to repeat my recommendation that students read the essay 
[156], which restates and reinvigorates many of the concerns that surfaced towards 
the end of the nineteenth century in partial differential equation theory. In particular, 
there is a stimulating return to the concerns that animated Hadamard and Poincaré.


Chapter 31
Translations


31.1 Cauchy: Note on the Integration of First-Order Partial 
Differential Equations in Any Number of Variables 


This is [34] (in Oeuvres (2) 2, 238).

Until now there has been no treatise on the differential and integral calculus where 
one is given the means to integrate completely partial differential equations of the 
first order in any number of independent variables. Having been occupied for several 
months with this object, I was happy to have obtained a general method appropriate to 
fulfilling this desire. But, having finished my work, I learned that M. Pfaff, a German 
geometer, had been led on his side to the solution of the equations mentioned above. 
As this concerns one of the most important questions in the integral calculus, and M.
Pfaff’s method is different from mine, I believe that geometers will not be without
interest in a short analysis of one and the other. I will first expound the method that
I have used, profiting, in order to simplify the exposition, from some remarks made 
by M. Coriolis, an engineer at Ponts et Chaussées, and some others that have since 
occurred to me. 

Suppose in the first place that we are to integrate a first-order partial differential
equation with two independent variables. One already has several methods for
integrating an equation of this kind, of which one (due to M. Ampère) is based on the
change of a single independent variable. The method that I propose, based on the
same principle as in the admitted hypotheses, reduces to this:

Let

\[ f(x, y, u, p, q) = 0 \tag{31.1} \]

be the given equation, in which x and y denote the two independent variables, u an
unknown function of these two variables, and p, q the partial derivatives of u relative
to the variables x and y. In order to completely determine the sought-for function
u it is not enough to know that it must satisfy Eq. (31.1); it is also necessary that it
satisfies another condition, for example, that it takes a certain particular value, a
function of y, for a given value of the variable x. Let us suppose in consequence that the
function u must receive, for $x = x_0$, the particular value $\varphi(y)$: the function q, or the
partial derivative of u relative to y, will on this hypothesis receive the particular value
$\varphi'(y)$. On the same hypothesis the general value of u is, as one knows, completely
determined. It now remains to calculate this value: one can proceed in the following
manner.

Let us replace y by a function of x and a new independent variable $y_0$. The
quantities u, p, q, being functions of x and y, become themselves functions of x and
$y_0$; and on differentiating on this supposition,¹


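\[ \frac{\partial u}{\partial x} = p + q\frac{\partial y}{\partial x}, \tag{31.2} \]

\[ \frac{\partial u}{\partial y_0} = q\frac{\partial y}{\partial y_0}. \tag{31.3} \]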

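If one takes one of these two equations from the other, after differentiating the first
with respect to $y_0$ and the second with respect to x, one finds that

\[ \frac{\partial p}{\partial y_0} = \frac{\partial q}{\partial x}\frac{\partial y}{\partial y_0} - \frac{\partial y}{\partial x}\frac{\partial q}{\partial y_0}. \tag{31.4} \]

If one also writes the total differential of the first member of Eq. (31.1) as

\[ X\,dx + Y\,dy + U\,du + P\,dp + Q\,dq = 0, \]

one finds, on differentiating this equation with respect to $y_0$, that

\[ Y\frac{\partial y}{\partial y_0} + U\frac{\partial u}{\partial y_0} + P\frac{\partial p}{\partial y_0} + Q\frac{\partial q}{\partial y_0} = 0, \tag{31.5} \]

and consequently, in view of Eqs. (31.3) and (31.4), that

\[ \left(Y + qU + P\frac{\partial q}{\partial x}\right)\frac{\partial y}{\partial y_0} + \left(Q - P\frac{\partial y}{\partial x}\right)\frac{\partial q}{\partial y_0} = 0. \tag{31.6} \]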


Let us now observe that the value of y as a function of x and $y_0$ being entirely arbitrary
one can dispose of it in such a way that it satisfies the differential equation

\[ Q - P\frac{\partial y}{\partial x} = 0, \tag{31.7} \]

and reduces to $y_0$ on the particular supposition $x = x_0$. The value of y as a function of
x and $y_0$ being chosen in the way just described, the particular values of u and q
corresponding to $x = x_0$, that is to say $\varphi(y)$ and $\varphi'(y)$, become, respectively, $\varphi(y_0)$ and
$\varphi'(y_0)$. Representing these values by $u_0$, $q_0$ one will have

\[ u_0 = \varphi(y_0), \qquad q_0 = \varphi'(y_0). \tag{31.8} \]

¹ See the comment at the end of the translation.


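As for formula (31.6), it is reduced by Eq. (31.7) to

\[ \left(Y + qU + P\frac{\partial q}{\partial x}\right)\frac{\partial y}{\partial y_0} = 0, \]

and as, y depending on $y_0$ by hypothesis, $\frac{\partial y}{\partial y_0}$ cannot be constantly zero, the same
formula becomes

\[ Y + qU + P\frac{\partial q}{\partial x} = 0. \tag{31.9} \]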


This done, the integration of Eq. (31.1) is reduced to the following question:
Find for y, u, p, q four functions of x and $y_0$, which satisfy the Eqs. (31.1), (31.2),
(31.3), (31.7), (31.9), and of which three, namely, y, u, q, reduce, respectively, to
$y_0$, $u_0$, $q_0$ on the supposition $x = x_0$.

We do not speak of Eq. (31.4), because it is a necessary consequence of Eqs. (31.2)
and (31.3). As for the particular value of p corresponding to $x = x_0$, it will not enter
into the general values of y, u, p, q determined by the preceding conditions. If one
denotes it by $p_0$ it will be deduced from the formula²

\[ f(x_0, y_0, u_0, p_0, q_0) = 0. \tag{31.10} \]


It is essential to remark that the general values of y, u, p, q as functions of x and
$y_0$ remain completely determined if, among the conditions that they must satisfy,
one fails to take account of the verification of Eq. (31.3). This last condition must
therefore be an immediate consequence of all the others. To show this, let us suppose
for a moment that, the other conditions having been verified, the two members of
Eq. (31.3) are unequal. The difference between these two members can only be a
function of x and $y_0$. Let $\alpha$ be this function and $\alpha_0$ be what it becomes when $x = x_0$.
One will have

\[ \frac{\partial u}{\partial y_0} - q\frac{\partial y}{\partial y_0} = \alpha, \qquad \alpha_0 = \varphi'(y_0) - q_0 = \varphi'(y_0) - \varphi'(y_0) = 0. \tag{31.11} \]


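Consequently, instead of Eqs. (31.3) and (31.4), one finds

\[ \frac{\partial u}{\partial y_0} = q\frac{\partial y}{\partial y_0} + \alpha, \qquad \frac{\partial p}{\partial y_0} = \frac{\partial q}{\partial x}\frac{\partial y}{\partial y_0} - \frac{\partial y}{\partial x}\frac{\partial q}{\partial y_0} + \frac{\partial \alpha}{\partial x}, \tag{31.12} \]

then, instead of (31.6) the following:

\[ \left(Y + qU + P\frac{\partial q}{\partial x}\right)\frac{\partial y}{\partial y_0} + \left(Q - P\frac{\partial y}{\partial x}\right)\frac{\partial q}{\partial y_0} + U\alpha + P\frac{\partial \alpha}{\partial x} = 0. \tag{31.13} \]

² Cauchy used the same subscript notation for the initial conditions that he had used for the new
variable, but the confusion this causes is slight.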

This last equation will reduce, by Eqs. (31.7) and (31.9), which one supposes verified,
to

\[ U\alpha + P\frac{\partial \alpha}{\partial x} = 0. \tag{31.14} \]

On integrating it, and treating $U/P$ as a function of x and $y_0$, one will find

\[ \alpha = \alpha_0\, e^{-\int_{x_0}^{x} \frac{U}{P}\,dx}, \tag{31.15} \]

and consequently, taking account of the second of the Eqs. (31.11), one will generally
have

\[ \alpha = 0. \tag{31.16} \]


The two members of Eq. (31.3) cannot, therefore, be unequal on the admitted
hypothesis. One must conclude from this that the quantities y, u, p, q satisfy all
the conditions required if these quantities, considered as functions of x, satisfy
Eqs. (31.1), (31.2), (31.7), (31.9), and if in addition y, u, q reduce, respectively,
to $y_0$, $u_0 = \varphi(y_0)$, and $q_0 = \varphi'(y_0)$ for $x = x_0$. It is useless to add that on the same
supposition p must obtain the particular value $p_0$; in fact this value will not be
contained in the integrals of Eqs. (31.1), (31.2), (31.7), (31.9), because none of these
equations contain $\frac{\partial p}{\partial x}$.

If, in Eq. (31.2), one substitutes the value of $\frac{\partial y}{\partial x}$ drawn from Eq. (31.7), one will
find

\[ \frac{\partial u}{\partial x} = p + \frac{Qq}{P} = \frac{Pp + Qq}{P}. \tag{31.17} \]


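Furthermore, if one differentiates Eq. (31.1) with respect to x, one obtains the
following:

\[ X + Y\frac{\partial y}{\partial x} + U\frac{\partial u}{\partial x} + P\frac{\partial p}{\partial x} + Q\frac{\partial q}{\partial x} = 0, \tag{31.18} \]

which the values of $\frac{\partial y}{\partial x}$, $\frac{\partial u}{\partial x}$, $\frac{\partial q}{\partial x}$ drawn from Eqs. (31.7), (31.17), and (31.9) reduce to

\[ X + pU + P\frac{\partial p}{\partial x} = 0. \tag{31.19} \]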


This done, one can substitute Eq. (31.17) for Eq. (31.2), and Eq. (31.19) for one of
Eqs. (31.1), (31.17), (31.7), (31.9). If besides one observes that, in the case where
one considers y, u, p, q as functions of x alone, one can include Eqs. (31.7), (31.9),
(31.17), (31.19) in the algebraic formula³

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dp}{X + pU} = -\frac{dq}{Y + qU}, \tag{31.20} \]

one definitively concludes that to determine the sought-for values of the quantities
y, u, p, q, it is enough to work with four of the five equations contained in the two
formulae

\[ f(x, y, u, p, q) = 0, \qquad \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dp}{X + pU} = -\frac{dq}{Y + qU}, \tag{31.21} \]

and to know, for $x = x_0$, the particular values $y_0$, $u_0$, $p_0$, $q_0$, for the three latter ones are
determined as a function of the first by Eqs. (31.8) and (31.10).


Suppose, to fix ideas, that by means of the equation

\[ f(x, y, u, p, q) = 0 \]

one eliminates p from the three equations in the formula

\[ \frac{dx}{P} = \frac{dy}{Q} = \frac{du}{Pp + Qq} = -\frac{dq}{Y + qU}. \tag{31.22} \]

On integrating the last three, one will obtain three finite equations that involve, with
the quantities

\[ x, \; y, \; u, \; q \]

the particular values represented by

\[ x_0, \; y_0, \; \varphi(y_0), \; \varphi'(y_0). \]


If after the integration one eliminates q, the remaining two equations involve, with
the quantities x, y, u, and the constant quantity $x_0$, only the new variable $y_0$, the
elimination of which can only be carried out when one has assigned a particular
form to the arbitrary function denoted by $\varphi$. Whatever it may be, the system of two
equations with which we are concerned can always be considered as equivalent to
the general integral of Eq. (31.1).

As, in all that has been done so far, one can substitute the variable x for the variable
y, and reciprocally, it follows that the integrals of Eqs. (31.21) again furnish a solution
of the question proposed, if in the integrals one considers $y_0$ as constant, $x_0$ as a
new variable that one must eliminate, and $u_0$, $p_0$, $q_0$ as functions of this new variable
that are determined by equations of the form

\[ u_0 = \varphi(x_0), \qquad p_0 = \varphi'(x_0), \tag{31.23} \]

\[ f(x_0, y_0, u_0, p_0, q_0) = 0. \tag{31.24} \]

Let us apply the principles we have just established to the solution of the partial
differential equation

\[ pq - xy = 0. \tag{31.25} \]

[The extract from Cauchy [34] ends here.]

³ Cauchy mistakenly wrote X + PU for X + pU.


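Cauchy showed that in this case Eqs. (31.21) become

\[ p\,dx = q\,dy = \tfrac{1}{2}\,du = x\,dp = y\,dq. \tag{31.26} \]

These imply that

\[ \frac{dp}{p} = \frac{dx}{x}, \qquad \frac{dq}{q} = \frac{dy}{y}, \qquad du = 2\,\frac{p}{x}\,x\,dx = 2\,\frac{q}{y}\,y\,dy; \tag{31.27} \]

then on integrating and taking note of the condition $p_0 q_0 = x_0 y_0$

\[ \frac{p}{p_0} = \frac{x}{x_0}, \qquad \frac{q}{q_0} = \frac{y}{y_0}, \tag{31.28} \]

\[ u - u_0 = \frac{p_0}{x_0}\,(x^2 - x_0^2) = \frac{q_0}{y_0}\,(y^2 - y_0^2) = \frac{y_0}{q_0}\,(x^2 - x_0^2) = \frac{x_0}{p_0}\,(y^2 - y_0^2). \tag{31.29} \]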

In these equations, he said, $x_0$ is an arbitrary constant and $y_0$ a new variable that one
can only eliminate after fixing a value for the arbitrary function $\varphi$. Finally Cauchy
deduced that the general integral is represented by the equations

\[ (u - \varphi(x_0))^2 = (x^2 - x_0^2)(y^2 - y_0^2), \qquad (u - \varphi(x_0))\,\varphi'(x_0) = x_0\,(y^2 - y_0^2), \]

and he noted that the second of these equations is the derivative of the first with
respect to $x_0$.
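
Cauchy's relations for this example can be checked by integrating the characteristic system numerically. The following sketch is an illustration, not Cauchy's own procedure. It introduces an auxiliary parameter τ (an assumption) so that, for f = pq − xy, Eqs. (31.21) become dx/dτ = q, dy/dτ = p, du/dτ = 2pq, dp/dτ = y, dq/dτ = x. It takes the arbitrary function to be φ(y) = y² for definiteness and uses a standard fourth-order Runge-Kutta step. All function names are hypothetical.

```python
def rhs(state):
    """Characteristic system for f = p*q - x*y (X = -y, Y = -x, U = 0, P = q, Q = p)."""
    x, y, u, p, q = state
    return (q, p, 2.0 * p * q, y, x)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shift(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shift(k1, h / 2.0))
    k3 = rhs(shift(k2, h / 2.0))
    k4 = rhs(shift(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def characteristic(x0, y0, phi, dphi, tau=1.0, steps=1000):
    """Follow one characteristic from (x0, y0) with u0 = phi(y0), q0 = phi'(y0)."""
    u0, q0 = phi(y0), dphi(y0)
    p0 = x0 * y0 / q0  # from f(x0, y0, u0, p0, q0) = 0, i.e. p0 * q0 = x0 * y0
    state = (x0, y0, u0, p0, q0)
    h = tau / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return (x0, y0, u0, p0, q0), state

# Illustrative choice (an assumption): phi(y) = y^2, starting point (1, 2).
start, end = characteristic(1.0, 2.0, lambda y: y * y, lambda y: 2.0 * y)
x0, y0, u0, p0, q0 = start
x, y, u, p, q = end
residuals = {
    "f = pq - xy": p * q - x * y,
    "p/p0 - x/x0": p / p0 - x / x0,
    "q/q0 - y/y0": q / q0 - y / y0,
    "u - u0 - (p0/x0)(x^2 - x0^2)": (u - u0) - p0 / x0 * (x * x - x0 * x0),
    "u - u0 - (q0/y0)(y^2 - y0^2)": (u - u0) - q0 / y0 * (y * y - y0 * y0),
}
for name, value in residuals.items():
    print(f"{name}: {value:.2e}")
```

Each printed residual should be zero up to the integration error, which confirms that the ratios p/p₀ = x/x₀ and q/q₀ = y/y₀ and the two expressions for u − u₀ hold along the curve.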

Cauchy concluded his paper with the observation that the method worked without 
essential change when there were more than two independent variables, and illus- 
trated his point by going over the method in the case of three independent variables. 
He also worked through the example 


\[ pqr = xyz. \]


Comment The change of variables argument at the start of Cauchy’s paper may be
easier to follow on introducing new variables s and t, where

\[ s = x, \quad t = t(x, y), \quad\text{so}\quad x = s \quad\text{and}\quad y = y(s, t). \]

This means that $x_s = 1$ and $x_t = 0$. At the end, restore x = s, $y_0 = t$.


31.2 Riemann’s Lectures on Partial Differential Equations 
and Physics 


Riemann lectured three times on physics at Göttingen: in 1854/55, in 1860/61, and
in 1862. After Riemann’s death his former pupil Karl Hattendorff edited Riemann’s
notes (mostly from the course of 1860/61) and published them as a book [237].⁴ In the
preface, he noted that while mathematicians drawn to the theory of partial differential
equations took their lead from Dirichlet, as indeed Riemann had done, what was here
was not restricted to potential theory but included a slew of applications.

The book became the principal introduction to mathematical physics for over a
generation, because it was taken up and re-edited by Heinrich Weber, and
“Riemann-Weber”, as it came to be known, grew to two volumes.


31.2.1 Riemann, Introduction to Partial Differential 
Equations 


The object of these lectures is the treatment of partial differential equations and their appli- 
cation to physical questions. Therefore it is convenient to make some introductory remarks 
on the relationship of the theory of partial differential equations to physics. 


It is well known that a scientific physics first began with the discovery of the differential 
calculus. Since one first learned how to follow the course of natural events continuously, 
research into the connection of appearances to abstract consequences has succeeded. This 
involves two things: first the simple basic ideas with which we construct, and second a 
method with which, from the simple basic laws of this construction that concern points of 
time and space, laws can be derived for finite intervals of time and space that alone are 
accessible to observations (and can be compared to experience). 


Galileo took the first step in respect of the basic ideas, when he constructed the laws of motion 
for freely falling bodies from the operation of weight at every moment of time; he found the 
law of accelerating force, the idea of a simple cause of motion. To this step Newton added 
a second: he found the idea of an attracting centre, the idea of a simple cause of force. With 
these two basic ideas, the idea of accelerating force and of an attracting or repelling centre,
physics is still constructed to this day. The present-day speculations of Laplace, Poisson,
Cauchy, where the thread of observations stops, are attributable only to the struggles with
the appearance of these two laws. In respect of the ideas that one places at the basis of the
physical explanation of nature, we therefore take today the standpoint of Newton. No new
step has been taken since Newton; all research into basic ideas that penetrate into the heart
of nature has up to now failed; the influence of later philosophical systems that have been
applied in the physical literature has only had the success of disfiguring Newton’s original
perception with inconsistencies.


⁴ I have used the third edition, 1882, which Hattendorff said is a careful revision of the first edition.


But the method by which the simple basic laws for moments of time and space
(differential equations) are turned into laws for finite intervals and extended bodies is
essentially perfected. At first, after the discovery of the differential calculus, one could handle
certain abstract cases: in the study of free fall one connected the mass of a body with its centre 
of gravity, one treated the heavenly bodies as mathematical points, in the study of pendulums 
one first treated only the mathematical pendulum i.e. a rigid movable line connected to a 
heavy point; so that one only had to take one step from the infinitely small to the finite in 
only one dimension with respect to one variable, the time. But in general, in order to derive 
the experiences from the elementary laws, one must take the step from the infinitely small 
to the finite in more than one dimension. For the elementary laws involve space and time 
points, experiences involve extended bodies. Such problems lead, to speak generally—in 
special cases the problem can be simplified—to partial differential equations. 


Sixty years after the appearance of Newton’s Principia the first physical problem was solved 
that led to a partial differential equation. It was the one that d’ Alembert showed determined 
the oscillations of a stretched string. It was then a long time until the general method was 
found by which the physical problems that lead to partial differential equations can be solved. 
For this we thank Fourier, who first applied such methods in his study of the diffusion of 
heat in solid bodies. This took almost as long from the origin of partial differential equations 
as that had from the creation of the differential calculus. Newton’s Principia appeared in 
1687, d’Alembert’s solution of the problem of the vibrating string in 1747, again 60 years 
later, on 21 December 1807, Fourier presented the first part of his work on heat to the Paris 
Academy. 


After these selective and not entirely accurate historical pages Riemann turned
to listing the many areas in physics where partial differential equations provided the
appropriate foundations. These included oscillations in gases, liquids, and solid
bodies, elasticity of bodies, and optics. He noted that most of this work involved making
assumptions about the molecules that make up these bodies, and so the determination of
the constants that enter the partial differential equations depended on assumptions
about the molecular composition of bodies to which, he said, we were far from having
the key. The same was true, he went on, for gravitation, electricity,
and magnetism: the fundamental laws involve partial differential equations. He then
concluded:


What then emerges as a fact by means of induction arises also a priori: the proper foundations 
for mathematical physics are partial differential equations. True elementary laws can only 
occur in the infinitely small, for space and time points. In general, such laws will be partial 
differential equations, and the derivation of laws for extended bodies and times requires their 
integration. So methods are necessary by which the finite laws can be derived from the laws 
of the infinitely small, and indeed derived with complete rigour neglecting nothing. Only 
then can they be tested against experience. 

The book itself opens with just under a hundred pages of mathematical methods 
and twenty on the basics of ordinary and partial differential equations. Then it turns 
to a more detailed investigation of heat diffusion in solid bodies, oscillations of solid 
bodies, fluid motion, oscillations in compressible media, and finally the motion of a 
solid body in an unbounded incompressible fluid. Nothing, one notes, on magnetism 
and electricity—an omission that became the main reason for Weber’s new editions— 
but one that Riemann had addressed in another series of lectures, later published 
as Schwere, Elektricität und Magnetismus (Gravity, Electricity, and Magnetism). 
That said, as Weber noted, Riemann’s book on partial differential equations was 
not a physics textbook but a mathematical book devoted to the solution of various 
mathematical problems. 


31.3 Extracts from Schwarz, ‘Ueber einige 
Abbildungsaufgaben’, 1869 


[At this stage in his paper Schwarz has shown that the most general map of an 
angular sector of angle $\alpha\pi$ to the upper half-plane that is given by a map that is 
analytic everywhere except at the origin, and maps the origin to itself, is of the 
form 

$$v = u^{1/\alpha}, \qquad t = C v\,(1 + a_1 v + a_2 v^2 + \cdots),$$ 

where $C$ is a non-zero constant and the coefficients $a_i$ are all real. The inverse function 
is 

$$u = v^{\alpha}, \qquad v = \frac{1}{C}\, t\,(1 + c_1 t + c_2 t^2 + \cdots),$$ 

where the coefficients $c_i$ are all real. 

He then continued:] 

In a problem about conformal maps the position and absolute size of the figure in 
the $u$-plane on which a figure in the $t$-plane is to be represented conformally is usually 
unimportant. So the general solution of the representation problem introduces two 
arbitrary constants that determine the position and absolute size, for, if $u = f(t)$ is 
a function that maps a figure $T$ in the $t$-plane onto a figure $U$ in the $u$-plane then 
$u' = C_1 u + C_2$ is another such function, only it places the corresponding figure $U'$ 
in another position, is of another proportion, and can be dragged to the position of 
the figure $U$. So if we have to obtain the characteristic properties of a figure $T$ on a 
figure $U$ we must look at the dependence between the quantities $u$ and $t$ to determine 
which are independent of the particular position and absolute size of the figure $U$ in 
the $u$-plane; that is, to determine the differential equation in whose general solution 
the constants $C_1$ and $C_2$ enter as constants of integration. 


This leads to 

$$\frac{d}{dt}\log\frac{du'}{dt} = \frac{d}{dt}\log\frac{du}{dt}.$$ 

This function is then independent of the particular position and absolute size of the 
figure $U$ in the $u$-plane. 

The passage from $u$ to $\frac{du}{dt}$ and $\frac{d}{dt}\log\frac{du}{dt}$ is all the more important a step because 
all the values of the argument $t$ for which the quantity $\frac{du}{dt}$ becomes infinitely large 
or infinitely small, and $\frac{d}{dt}\log\frac{du}{dt}$ infinitely large, are singular points for the 
representation problem, in that conformal representation in the strict sense cannot hold for 
them. 

In the case already considered of the conformal representation of an angle $\pi$ on 
an angle $\alpha\pi$, 

$$\frac{d}{dt}\log\frac{du}{dt} = \frac{\alpha - 1}{t} + d_1 + d_2 t + \cdots.$$ 


This function has the character of a rational function in the neighbourhood of the 
value $t = 0$. The coefficients $d_1, d_2, \ldots$ all have real values and therefore the values 
of the function $\frac{d}{dt}\log\frac{du}{dt}$ for those real values of the argument $t$ for which the series 
converges, are likewise real. 

Therefore, when the problem is to conformally map the surface of a figure $T$ 
in the $t$-plane onto another bounded by a simple curve (i.e. one that goes through 
no point more than once) lying entirely in the finite part $U$ of the $u$-plane, then 
it is immediately assumed that the quantity $\frac{du}{dt}$ can never become infinitely small 
or infinitely large at any point in the interior of $T$, and therefore that the function 
$\frac{d}{dt}\log\frac{du}{dt}$ has the character of an entire function for all values of the argument $t$. 

In the present case the singular values of $t$ lying in the finite part of the plane are 
$t = -1$, $t = 0$, $t = +1$; $\alpha$ is equal to $\tfrac12$. The function 

$$\frac{d}{dt}\log\frac{du}{dt} + \frac12\left(\frac{1}{t+1} + \frac{1}{t} + \frac{1}{t-1}\right),$$ 

which for all real values of the argument likewise has real finite values, has the 
character of an entire function for all finite values of $t$ with positive imaginary part, 
and so for all finite values of $t$ it has the character of an entire function.⁵ For the 
infinite value of $t$ there is the development 

$$u - u_0 = \frac{C_1}{\sqrt{t}}\,(1 + e_1 t^{-1} + e_2 t^{-2} + \cdots),$$ 

from which 

$$\frac{d}{dt}\log\frac{du}{dt} = -\frac{3}{2}\,\frac{1}{t} + \cdots,$$ 

so the function $\frac{d}{dt}\log\frac{du}{dt}$ is infinitely small for all infinite values of $t$, and therefore 
$\frac{d}{dt}\log\frac{du}{dt}$ is a rational function of $t$ and indeed equal to 
$-\frac12\left(\frac{1}{t+1} + \frac{1}{t} + \frac{1}{t-1}\right)$. From this it follows by integration that 

$$\log\frac{du}{dt} = -\frac12 \log 4t(1 - t^2) + \log C_1,$$ 

$$u = C_1 \int_0^t \frac{dt}{\sqrt{4t(1 - t^2)}} + C_2.$$ 

⁵This follows from the Schwarz reflection principle that Schwarz had introduced earlier in his paper. 


One can easily recognise that in this case the lemniscatic integral 

$$u = \int_0^t \frac{dt}{\sqrt{4t(1 - t^2)}}$$ 

represents the interior of each of the two half-planes into which the plane is divided by 
the real axis conformally on the interior of a square with sides 
$\int_0^1 \frac{dt}{\sqrt{4t(1 - t^2)}}$. Through 
the substitution s = ct one goes from the half-plane lying on the positive side of the 
real axis [i.e. the upper half-plane, JJG] to the surface of a circle of radius 1 drawn 
in the $s$-plane around the point $s = 0$. 


[Schwarz then considered a number of other cases, of which I only translate these 
two.] 


If the problem is to represent the interior of a half-plane $T$ conformally on the 
interior of a straight-sided triangle with angles $\alpha\pi$, $\beta\pi$, $\gamma\pi$ then by an analogous 
argument one deduces that the representation is provided by a function of the form 

$$C_1 u + C_2 = \int_{t_0}^{t} (t - a)^{\alpha - 1} (t - b)^{\beta - 1} (t - c)^{\gamma - 1}\, dt.$$ 

In this case the three real values that correspond to the three vertices of the 
straight-sided triangle can be chosen arbitrarily provided they follow the same order on the 
boundary of the half-plane $T$ as that in which the angles $\alpha\pi$, $\beta\pi$, $\gamma\pi$ of the triangle 
are encountered on a circuit around the interior of the triangle. 

With this result the form of the function is found that conformally represents the 
surface of a half-plane onto the simply connected surface of any straight-sided 
polygon. In the general case of an $n$-gon, only three of the $n$ real quantities $a, b, c, \ldots$ that 
correspond to the vertices of the straight-sided polygon can be chosen arbitrarily; the 
remaining $n - 3$ are determined by the given ratios of the lengths of the individual 
sides of the polygon under consideration. 


[A little later in the paper, Schwarz remarked: ] 
Concerning the problem of representing the surface of a straight-sided polygon on 
the surface of a circle I had the pleasure of seeing the researches of Herr Christoffel 
on the subject: Sul problema delle temperature stazionarie e la rappresentazione di 
una data superficie, Annali di matematica, IIª serie, tomo I°, 1867, that seem to agree 
with mine. 


31.3.1 The Schwarz–Christoffel Transformation 


It will be helpful to give a more modern account. Consider the problem of finding a 
map $z \mapsto f(z)$ from the upper half-plane onto a triangle with vertices at the points 
$b_1, b_2, b_3$ and angles at these points of $\alpha_1\pi, \alpha_2\pi, \alpha_3\pi$, respectively. We require that 
the map sends the points $a_1, a_2, a_3$ on the real axis to the points $b_1, b_2, b_3$, and we 
look for a map that is holomorphic everywhere except at the points $a_1, a_2, a_3$. 

The map will fail to be holomorphic precisely at the points where its derivative 
vanishes, which suggests that the map must be such that 

$$f'(z) = (z - a_1)^{s_1} (z - a_2)^{s_2} (z - a_3)^{s_3}.$$ 

Now, the map $z \mapsto z^{\alpha}$ maps the origin to the origin and an interval around the origin 
to two line segments meeting at an angle of $\alpha\pi$, and its derivative is 

$$f'(z) = \alpha z^{\alpha - 1},$$ 

so this suggests that when mapping the half-plane to a triangle we try maps for which 

$$f'(z) = (z - a_1)^{\alpha_1 - 1} (z - a_2)^{\alpha_2 - 1} (z - a_3)^{\alpha_3 - 1}.$$ 


Notice that away from the pre-images of the vertices the map is locally one-to-one, 
because f’ does not vanish. So we can find the image of a domain that does not have 
a pre-image of a vertex in its interior by finding the image of the boundary of the 
domain. 

Integration produces the map 

$$f(z) = C_0 \int_0^z (\zeta - a_1)^{\alpha_1 - 1} (\zeta - a_2)^{\alpha_2 - 1} (\zeta - a_3)^{\alpha_3 - 1}\, d\zeta + C_1.$$ 

Certainly, this map maps angular segments of $\pi$ around each $a_j$ to angular segments 
of $\alpha_j\pi$ around each of three points. But are they the points $b_j$, and are the sides (the 
images of the segments joining each $a_j$ to the next one, via $\infty$ if need be) straight? 

The second question is easy, and the answer is Yes. Consider the interval $(a_1, a_2)$. 
The integral can be written as 

$$\int_{a_1}^{a_2} (x - a_1)^{\alpha_1 - 1} (x - a_2)^{\alpha_2 - 1} (x - a_3)^{\alpha_3 - 1}\, dx,$$ 

which is real, and so the image is part of the real axis, and therefore straight. We can 
adjust the arbitrary constants to map $a_1$ to $b_1$ and rotate the line segments around $b_1$ 
to point in the right directions, and because maps of the form $z \mapsto C_0 z + C_1$ map 
line segments to line segments, the images of the sides through $b_1$ will be straight. 
We can do this for any vertex, so the image of the upper half-plane has straight sides. 

That brings us to the first question, because everything now depends on the lengths 
of the sides. In the case of a triangle this is easy, because a triangle is determined up 
to size by its angles, and the constant $C_0$ takes care of any potential problem. 

But what happens if we want to map the upper half-plane onto a quadrilateral, or 
more generally an $n$-gon with prescribed vertices and sides? Can it be done? It turns 
out that the answer to this question is also Yes, but the computation of the lengths 
of the sides of the polygon as functions of the choice of positions of the pre-images of 
the vertices and of the angles is difficult and cannot be discussed here.⁶ For maps of 
the upper half-plane onto $n$-gons with given angles ($n > 3$) the positions, $a_j$, of the 
pre-images of the vertices must also be specified, and the proof that a solution can 
always be found is delicate. In particular, the length of the side with vertices $f(a)$ 
and $f(b)$ is given by the integral 

$$\int_a^b |f'(z)|\, dz.$$ 

Notice that the function $f(z)$ involves all the pre-images of the vertices. It can be 
done, and the formula that does it, the natural generalisation of the above integral 
to any number of vertices, is called the Schwarz–Christoffel formula. 
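
For the triangle the side lengths can be checked in closed form. The following Python sketch is an illustration added to the modern account, not something taken from Schwarz or Christoffel. It assumes a particular normalisation: the pre-images of the vertices are placed at 0, 1 and ∞, and the multiplicative constant is 1, so that |f'(t)| = t^(α−1) |1 − t|^(β−1) (with a pre-image at ∞ the corresponding factor drops out). Under these assumptions each integral of |f'| along a boundary interval is an Euler Beta function, and the reflection formula for the Gamma function shows that the sides are proportional to the sines of the opposite angles, as the law of sines requires. The function names are hypothetical.

```python
import math


def beta(p, q):
    """Euler Beta function B(p, q) = Gamma(p) * Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def triangle_side_lengths(alpha, beta_, gamma_):
    """Side lengths of the image triangle with angles alpha*pi, beta_*pi, gamma_*pi.

    Assumed normalisation (illustrative only): pre-vertices 0, 1, infinity and
    multiplicative constant 1, so |f'(t)| = t**(alpha-1) * |1-t|**(beta_-1).
    Returns the sides opposite the angles alpha*pi, beta_*pi, gamma_*pi, in that order.
    """
    if not math.isclose(alpha + beta_ + gamma_, 1.0):
        raise ValueError("the angles of a triangle must sum to pi")
    # Image of the interval (0, 1): joins the vertices with angles alpha*pi and beta_*pi.
    side_opposite_gamma = beta(alpha, beta_)
    # Image of (1, infinity): the substitution t -> 1/t turns it into B(beta_, gamma_).
    side_opposite_alpha = beta(beta_, gamma_)
    # Image of (-infinity, 0): the substitution t -> -s turns it into B(gamma_, alpha).
    side_opposite_beta = beta(gamma_, alpha)
    return side_opposite_alpha, side_opposite_beta, side_opposite_gamma


# Usage: for a triangle with angles pi/2, pi/3, pi/6 the three ratios
# side / sin(opposite angle) should coincide (law of sines).
angles = (1 / 2, 1 / 3, 1 / 6)
sides = triangle_side_lengths(*angles)
ratios = [s / math.sin(a * math.pi) for s, a in zip(sides, angles)]
```

The common value of the three ratios is Γ(α)Γ(β)Γ(γ)/π, which makes explicit how a single scale factor fixes the size of the triangle once the angles are given.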


31.4 An Extract from Schwarz, On the Alternating Method (1870) 


[This comes from Schwarz’s paper “On a passage to a limit by an alternating 
method” [244].] 


The rigour of the well-known inference that goes under the name of the Dirichlet 
principle, and that in a certain sense must be seen as the foundation of the branch of 
the theory of analytic functions developed by Riemann, is subject, as is now quite 
generally admitted, to very well-founded objections whose complete resolution to 
my knowledge the efforts of mathematicians have not yet achieved. 

By developing some enquiries, which involve a certain kind of representation, 
and part of which I have published in vol. 70 of Borchardt’s Journal and in the paper 
“On the theory of representation” in the programme of the polytechnic school for the 
Winter semester 1869-70, I have been led to a method of proof by means of which 
I am convinced that all the theorems that Riemann used the Dirichlet principle to 
prove in his published works can be proved rigorously. 


⁶For a complete discussion, see Nehari ([204], 189-198). 

The following report is essentially a summary of a work on the integration of the 
partial differential equation $\Delta u = 0$, that I reported on to Herr Kronecker and some 
other mathematicians in November last year. 

It is concerned essentially only with the proof of the existence of a function $u$ 
that on a given domain $T$ of the independent variables $x$ and $y$ satisfies the partial 
differential equation 

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$ 

and also satisfies certain prescribed boundary and discontinuity conditions. 

For brevity, I restrict myself here to the case in which the auxiliary conditions are 
only boundary conditions and therefore imply that the function u is always continuous 
and takes prescribed finite values on the boundary of the domain T, which consists 
of one or more continuous parts. The general case can be reduced to this case by a 
method to be described. 

For the applicability of this method of proof it is in no way necessary to assume 
that the boundary curve of T has only finitely many corners, nor that in general at 
every point it has a finite radius of curvature, an assumption that Herr Weber and Herr 
Carl Neumann made for this purpose in their researches (see Borchardt’s Journal, 
vol. 71, p. 29 and the Berichte der mathematisch-physischen Classe der Königlich 
Sächsischen Gesellschaft der Wissenschaften, 21 April 1870). At no point will the 
tangent to the boundary curve be assumed to vary continuously; rather, it is enough to 
know that the boundary curve can be divided into a finite number of pieces such that 
in the interior of each piece the change in the direction of the tangent is always in the 
same sense even though it may also have infinitely many jumps, and so, therefore, 
the boundary curve can have infinitely many corners. 

Cusps on the boundary curve are also not excluded. I have carried out the analysis 
of such cusps, which arise from the contact of two analytic curves that have the 
character of algebraic curves in a neighbourhood of the point of contact; but to avoid 
unnecessary complications here no reference is made in what follows to the presence 
of cusps. 

The success of the proof whose basic idea is reported here rests in the last analysis 
on the following lemma: 

The boundary line of the domain $T$ for which it is possible to integrate the partial 
differential equation $\Delta u = 0$ with arbitrary boundary conditions, will be divided 
into a finite number of segments (parts). These can be arranged in two groups in such 
a way that each group contains at least one segment. One can give the individual 
segments, according as they belong to the first or second group, an odd or even 
number and denote the points that separate the segments with an even number from 
those with an odd number by P. In the interior of T one considers a finite number of 
analytic curves L that have either no point or only an end point P in common with 
the odd-numbered segments and are not tangents at these points. 
In this way we determine a function $u$ for the domain $T$ that satisfies the partial 
differential equation $\Delta u = 0$ and at all points of the boundary of $T$ has the value 0 
or 1 according as the number of the segment in the interior of which the given point 
lies is even or odd. Then the upper bound, respectively, the maximum of all values 
that the function $u$ takes on the curves $L$, is a positive number $q$ that is less than 1. 

We now determine a function $u_1$ that satisfies the partial differential equation 
$\Delta u_1 = 0$ on the same domain $T$, with the same division of the boundary into odd 
and even numbered segments and the same curves $L$, that takes the value 0 on the 
even numbered segments of the boundary, and on the odd numbered segments takes 
arbitrarily prescribed values that do not exceed a quantity $g$ in absolute value; then 
the absolute values of the values that the function $u_1$ can take on the curves $L$ never 
exceed the value $gq$, where $q$ has the previously ascribed significance and so is less 
than 1. 

For the surface of the circle, and for all simply connected surfaces that are known 
to be conformal images of the circle, the integration of the partial differential equation 
$\Delta u = 0$ with prescribed boundary conditions presents no difficulty. In this regard 
the task may be treated as in a paper in this journal (pp. 113-128 of the current year); 
there, breaks in the continuity for the function u in the prescribed series of values for 
the function u are exceptionally excluded; in this way the inference there developed 
can, mutatis mutandis, also be given if a finite number of boundary points in the 
series of boundary points are subjected to a break in continuity. 

After it is shown that for a number of simpler domains the differential equation 
$\Delta u = 0$ can be integrated for arbitrary boundary conditions, a proof has to be 
found that for a less simple domain that is composed of these in a certain 
way the integration of the differential equation is also possible with arbitrary 
boundary conditions. For the proof of this theorem a limiting argument can serve that 
has a great analogy with a two-chamber air pump used to produce an evacuated space. 
In both cases each period of the operation consists of two alternately 
operating single operations, which indeed have the same purpose but are not identical 
in respect of the way and manner in which they work; rather, they are in a certain sense 
symmetric (Fig. 31.1). 

Such a limiting argument may be called a limiting argument by an alternating 
method. 

Let two domains $T_1$ and $T_2$ be given which have one or more domains $T^*$ in 
common, and whose boundary lines are not tangent. (In the schematic figure 10, $T_1$ 
is the surface of a circle, $T_2$ the surface of a square.) 

The totality of all parts of the boundary of $T_1$ that lie outside $T_2$ will be denoted 
$L_0$, the totality of all remaining parts of the boundary of $T_1$ that lie inside $T_2$ will be 
denoted $L_2$. 

Likewise the boundary of $T_2$ divides into the two parts $L_1$ and $L_3$, if indeed the 
totality of all pieces of the boundary that lie inside $T_1$ will be denoted by $L_1$, and the 
totality of all parts of the boundary that lie outside $T_1$ will be denoted $L_3$. 

It will be assumed that equally for the domains $T_1$ and $T_2$ it is possible to integrate 
the partial differential equation $\Delta u = 0$ with arbitrary boundary conditions; it then 
remains to show that this is also possible for the domain $T_1 + T_2 - T^* = T$ that has 
the domains $T_1$ and $T_2$ as parts and in which their common domain $T^*$ is counted 
only once. 


Fig. 31.1 Schwarz’s Figure 10. Schwarz, Gesammelte mathematische 
Abhandlungen, vol. I, p. 136 

The conditions of the previous lemma are satisfied as well for the domain $T_1$ and 
the curve $L_1$ as for the domain $T_2$ and the curve $L_2$; in the first case the curve $L_0$ and 
in the second the curve $L_3$ may be taken as the location of the group of segments of 
even order. It is therefore possible to determine two numbers $q_1$ and $q_2$ that play the 
role of $q$ in the lemma and are therefore both less than 1. 

To the receiver of the air pump corresponds (maintaining the above analogy) 
the domain $T^*$, to the interiors of the two pump cylinders correspond the 
domains $T_1 - T^*$, $T_2 - T^*$, and to the valves the curves $L_1$ and $L_2$. 

On the boundary of $T$, thus along $L_0$ and $L_3$, let the values of the function be 
given arbitrarily; let $g$ be the upper bound and $k$ the lower bound of these values; the 
difference $g - k$ will be denoted $G$. 

Now one takes along $L_2$ an arbitrary sequence of values, for example, the value 
$k$ at every point of $L_2$, and determines for the domain $T_1$ a function $u_1$ that takes the 
prescribed values along $L_0$, has the value $k$ along $L_2$ and satisfies the differential 
equation $\Delta u_1 = 0$ in the interior of $T_1$. By the assumption about the domain $T_1$ there 
is such a function. (First push of the first piston.) 

The values that the function $u_1$ has along $L_1$ one thinks of as fixed, and determines 
a function $u_2$ for the domain $T_2$ that along $L_3$ has the prescribed values and agrees 
with the previously determined function $u_1$ along $L_1$ and for which $\Delta u_2 = 0$. By the 
assumption about the domain $T_2$ there is such a function. (First push of the second 
piston.) 

The value of $u_2 - u_1$ or $u_2 - k$ along $L_2$ is smaller than $g - k = G$. 

One determines for the domain $T_1$ a function $u_3$ that takes the prescribed values 
along $L_0$, has the value $u_2$ along $L_2$ and for which $\Delta u_3 = 0$ in the interior of $T_1$. 
(Second push of the first piston.) 

At no point in the interior of $T_1$ is the difference $u_3 - u_1$ negative; in absolute 
value the difference $u_3 - u_1$ is less than $G$, but along $L_1$, by the earlier lemma, less 
than $Gq_1$, because $u_3 - u_1$ has the value 0 along $L_0$ and along $L_2$ it is smaller than 
$G$. 

The values that the function $u_3$ has along $L_1$ one thinks of as fixed, and determines 
a function $u_4$ for the domain $T_2$ that along $L_1$ agrees with $u_3$ and along $L_3$ has the 
prescribed values and for which $\Delta u_4 = 0$. By the assumption about the domain $T_2$ 
there is such a function. (Second push of the second piston.) 

The difference $u_4 - u_2$ along $L_3$ has the value 0, and along $L_1$, where it agrees 
with $u_3 - u_1$, it is positive and smaller than $Gq_1$; therefore $u_4 - u_2$ is never negative 
in the interior of $T_2$ and is always less than $Gq_1$, but along $L_2$ smaller than $Gq_1q_2$. 

By continuing this alternating method one obtains a sequence of infinitely many 
functions with odd and even index. The ones for the domain $T_1$ and the others for 
the domain $T_2$ are so defined that, respectively, along $L_0$ and $L_3$ they have the 
prescribed values and in the interior of the domains on which they are defined they 
satisfy the partial differential equation $\Delta u = 0$. 

For the domain $T^*$ the functions with odd and with even index are both defined, and indeed 
they agree with each other alternately along $L_1$ and $L_2$. Indeed along $L_1$, $u_{2n} = u_{2n-1}$, 
and along $L_2$, $u_{2n+1} = u_{2n}$. 

It is now not difficult to prove that the functions with odd and even index approach 
definite limiting functions $u'$ and $u''$ as the index increases, as is shown by the 
equations 

$$u' = u_1 + (u_3 - u_1) + (u_5 - u_3) + \cdots + (u_{2n+1} - u_{2n-1}) + \cdots,$$ 

$$u'' = u_2 + (u_4 - u_2) + (u_6 - u_4) + \cdots + (u_{2n+2} - u_{2n}) + \cdots.$$ 

The series on the right-hand side converge unconditionally and uniformly (“in 
gleichem Grade”) for all pairs of values of $x$, $y$ under consideration, indeed 

$$(u_{2n+1} - u_{2n-1}) < G(q_1 q_2)^{n-1} \quad\text{and}\quad (u_{2n+2} - u_{2n}) < G(q_1 q_2)^{n-1} q_1.$$ 

Along $L_1$ as well as along $L_2$, $u' = u''$. In the interior of $T_1$, $\Delta u' = 0$, in the 
interior of $T_2$, $\Delta u'' = 0$, therefore at every point of $T^*$, $u' = u''$, because along the 
entire boundary of $T^*$ both functions agree with each other. 

Therefore both functions $u'$ and $u''$ are values of the same function $u$, which is 
defined for the interior of the whole domain $T = T_1 + T_2 - T^*$, satisfies the partial 
differential equation $\Delta u = 0$ there, and takes the prescribed values along the 
boundary $L_0 + L_3$. 

Thus the proof of the correctness of the above considerations is established: Under 
the given conditions it is possible for the domain $T$ to integrate the partial differential 
equation with arbitrary prescribed boundary conditions. 

By repeated application and suitable modification of the limiting process of the 
alternating method just described, the existence of a function $u$ on a given domain can 
be established also for boundary conditions with discontinuities, or with prescribed 
discontinuities such as Abelian integrals possess, the existence of which Riemann 
required in his work and sought to prove using the Dirichlet principle. 

The outlined method of proof extends not only to the case in which the domain T 
is represented geometrically as a simply or multiply connected Riemann surface in 
its entire extension in the plane or the surface of a sphere, but is essentially unaltered 
also in the case in which this surface is spread over one or many plane or spherical 
surfaces and polyhedral surfaces. 

By means of this extension the proof can be given, among other things, that 
a simply connected domain spread over a polyhedral surface can be conformally 
represented on the surface of a circle if this surface has a closed boundary curve and 
on the surface of a sphere if it is a simply connected and closed domain. 

In this way the question is answered of the possibility of determining the constants 
to which the conformal representation of a simply connected covering surface of 
a polyhedron bounded by plane figures on the surface of a sphere can be reduced (see 
Borchardt’s Journal, vol. 70, p. 119). 

A special case of the just-mentioned problem occurs when it is required to map 
a simply connected surface in the form of a plane figure with polygonal boundary 
conformally onto the surface of a circle, where the surface of the polygon may lie 
entirely in the finite or contain the infinitely distant point once or several times in its 
interior; branch points in the interior are also not excluded. For this problem the sole 
difficulty consists in the proof of the possibility that on a certain number of parts real 
and on some parts complex conjugate constants can be determined, upon which the 
function providing the conformal representation depends, so that all the conditions 
of the problem are satisfied. 

This difficulty can be overcome by the method that Herr Weierstrass has devel- 
oped. The application of the above limiting process offers a new way to overcome 
it. 

Similarly, the proof is provided of the possibility of determining the constants 
to which the problem of the conformal representation of a simply connected figure 
bounded by circular arcs upon the surface of a circle is reduced. 


[The later Nachtrag, a dispute with Christoffel on conformal representation, is not 
translated.] 
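
The mechanism of the alternating method, and the role of the numbers $q_1$ and $q_2$, can be seen in a one-dimensional toy analogue. The following Python sketch is an illustration added here, not Schwarz's construction: it assumes the "domains" are the overlapping intervals T1 = [0, b] and T2 = [a, 1] with 0 < a < b < 1, so that harmonic functions are simply linear functions, the curve L1 reduces to the point x = a and L2 to the point x = b. The function name and parameters are hypothetical. In this setting the lemma's constants are q1 = a/b and q2 = (1 − b)/(1 − a), and each double push of the pistons shrinks the error by the factor q1·q2.

```python
def alternating_method_1d(g0, g1, a, b, k=0.0, sweeps=30):
    """Toy 1-D analogue of the alternating method (illustrative assumption).

    T1 = [0, b], T2 = [a, 1]; prescribed boundary values u(0) = g0, u(1) = g1.
    k is the arbitrary starting value on L2 (the point x = b).
    Returns the list of pairs (u at x = a, u at x = b) after each double push.
    """
    if not 0.0 < a < b < 1.0:
        raise ValueError("need 0 < a < b < 1 so that T1 and T2 overlap")
    u_at_b = k
    history = []
    for _ in range(sweeps):
        # Push of the first piston: solve on T1 with u(0) = g0, u(b) = current value.
        # The solution is linear, so its value on L1 (x = a) is an interpolation.
        u_at_a = g0 + (u_at_b - g0) * a / b
        # Push of the second piston: solve on T2 with u(a) = u_at_a, u(1) = g1,
        # and read off the new value on L2 (x = b).
        u_at_b = u_at_a + (g1 - u_at_a) * (b - a) / (1.0 - a)
        history.append((u_at_a, u_at_b))
    return history


# Usage: with g0 = 0, g1 = 1 the exact solution on [0, 1] is u(x) = x,
# so the pairs should approach (a, b) = (0.4, 0.6).
history = alternating_method_1d(0.0, 1.0, a=0.4, b=0.6)
errors_at_b = [abs(u_b - 0.6) for _, u_b in history]
```

With a = 0.4 and b = 0.6 both constants equal 2/3, so successive entries of `errors_at_b` shrink by the factor 4/9, which mirrors the geometric bounds $G(q_1 q_2)^{n-1}$ in the extract.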


31.5 Schwarz on the Hypergeometric Equation (1873)—A 
Summary 


Report on those cases in which the Gaussian hypergeometric series $F(\alpha, \beta, \gamma, x)$ 
is an algebraic function of its fourth element. Schweizerischen Naturforschenden 
Gesellschaft, 1871, 74-77, (session of 22 August 1871), in Gesammelte Mathematische 
Abhandlungen 2, 172-174. 

The problem of studying when a given (ordinary) algebraic differential equation 
has a particular algebraic solution, and, if this is the case, of finding all its particular 
algebraic solutions, still belongs today to the most difficult problems in analysis. 
It seems, in the present state of knowledge, that the problem must be tackled in 
isolated cases with the help of such methods as are appropriate to the special cases 
under consideration. 

For the second-order linear ordinary differential equation that the Gaussian 
hypergeometric series $F(\alpha, \beta, \gamma, x)$ satisfies, considered as a function of its fourth 
element, the following train of thought leads to a complete solution of the given 
problem. 

The general solution of the given differential equation can, as it is easy to see, 
only be an algebraic function of the independent variable $x$ if the first three elements 
$\alpha, \beta, \gamma$ are real and indeed rational numbers. If on the assumption that these 
conditions are met, one considers in addition to the general solution of the differential 
equation the quotient of two linearly independent particular solutions, then this latter 
is related to the general solution in such a way that either both are algebraic functions 
of the argument $x$ or neither of the two functions depends algebraically on the quantity 
$x$. 

The independent variable x is an unrestricted variable quantity that can take all 
real and complex values. If one now thinks of the plane, whose points represent 
the values of the complex quantity x geometrically, as divided by the real axis into 
two half-planes, and considers the conformal representation that is provided by a 
branch s of the above-mentioned quotient as a function of the complex variable x, 
then there corresponds to each of the two half-planes a figure whose points represent 
geometrically the values of the complex quantity s; one, generally, a domain bounded 
by three circular arcs and which can therefore be called a circular-arc triangle. 

By analytic continuation of the branch of s under consideration there arise in 
general infinitely many circular-arc triangles in the plane of the complex variable s, 
and indeed every neighbouring two of them have a side in common. If, in a special 
case, this side is straight, then both triangles correspond one to the other in the 
usual way, i.e. they are symmetrical figures with respect to the line. If however as a 
consequence of the development the common side is a circular arc (and this is the 
general case) then in place of the usual symmetry a Möbius circular transformation 
occurs, and indeed the circle to which the common arc belongs is the directrix of this 
transformation. This relationship can rather be called symmetry with respect to an 
arc. 

Through these considerations the problem that is to be solved, that was originally 
function-theoretic, reduces to the following geometric one: Find all circular-arc tri- 
angles that on being multiplied by this symmetry law occupy only a finite number 
of positions and have the form of various different circular-arc triangles. 

Through geometric arguments one now finds that the number of different sym- 
metric repetitions of a circular-arc triangle can only be finite when it is possible to 
map this triangle conformally onto the surface of a sphere so that it corresponds to 
a spherical triangle. Since now for a spherical triangle all corresponding repetitions 
are either symmetric figures in the strict sense or congruent figures, in this way the 
question is reduced to the following purely geometric problem: 

“A body has only a finite number of symmetry planes: find all of the different 
positions these can have”. 

This problem, already solved by Steiner, leads either to a family of $n$ planes with 
a common axis, in which each plane meets the next at an angle of $\pi/n$, and with a 
plane that cuts each plane of this family at right angles, or to the symmetry planes 
of a regular polyhedron. 

This is the connection of the question “When is the general solution of the differential 
equation of the hypergeometric series $F(\alpha, \beta, \gamma, x)$ an algebraic function of the 
argument $x$?” with the theory of regular polyhedra. 

The case in which not the general but only one particular integral of this differential 
equation is an algebraic function of the argument x—a case that is easy to analyse— 
can be set aside here. 


31.6 Darboux on the Solution of Riemann’s Equation 
(1887) 


We now follow the account given in Darboux ([58], vol. 2, Sects. 358–360).⁷ 


Section 358 The adjoint equation to a given linear equation was presented for the 
first time in a memoir by Riemann on the propagation of sound. [...] In what follows, 
we discuss only the equation, already studied in the previous chapter, 


\[
\frac{\partial^2 z}{\partial x\,\partial y} + a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz = 0,
\]


[where a, b, c are functions of x and y]. We shall give Riemann’s results and indicate 
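the consequences that one can deduce. One has⁸


\[
\begin{aligned}
F(z) &= \frac{\partial^2 z}{\partial x\,\partial y} + a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz,\\
G(u) &= \frac{\partial^2 u}{\partial x\,\partial y} - a\frac{\partial u}{\partial x} - b\frac{\partial u}{\partial y} + \left(c - \frac{\partial a}{\partial x} - \frac{\partial b}{\partial y}\right)u,\\
M &= auz + \frac12\left(u\frac{\partial z}{\partial y} - z\frac{\partial u}{\partial y}\right),\\
N &= buz + \frac12\left(u\frac{\partial z}{\partial x} - z\frac{\partial u}{\partial x}\right).
\end{aligned}
\]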


Riemann showed that if S is a region bounded by a simple closed curve o then 


7See also (Courant—Hilbert Vol. 2, Ch. 5, Sect. 5). 
8If we regard F as an operator, then we regard G as the adjoint operator. 


\[
\iint_S \bigl(uF(z) - zG(u)\bigr)\,dx\,dy = \int_\sigma (M\,dy - N\,dx). \tag{31.30}
\]
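
A short check, added here and not part of Darboux's text, shows where an identity of this kind comes from. Taking F, G, M, and N as defined above, one differentiates M with respect to x and N with respect to y and adds the results; the terms carrying the factor 1/2 combine in pairs, and Green's theorem applied to the sum then gives Eq. (31.30).

```latex
\[
\begin{aligned}
\frac{\partial M}{\partial x} + \frac{\partial N}{\partial y}
&= \left(\frac{\partial a}{\partial x} + \frac{\partial b}{\partial y}\right)uz
 + a\left(z\frac{\partial u}{\partial x} + u\frac{\partial z}{\partial x}\right)
 + b\left(z\frac{\partial u}{\partial y} + u\frac{\partial z}{\partial y}\right)
 + u\frac{\partial^2 z}{\partial x\,\partial y}
 - z\frac{\partial^2 u}{\partial x\,\partial y} \\
&= uF(z) - zG(u).
\end{aligned}
\]
```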


Suppose that z and u are, respectively, some solutions of the given equation and
its adjoint, so
\[
F(z) = 0, \qquad G(u) = 0.
\]


Then the integral over S in Eq. (31.30) vanishes, and we therefore have 


\[
\int_\sigma (M\,dy - N\,dx) = 0.
\]


Let A be an arbitrary point in the plane and B’C’ a curve placed arbitrarily in the 
plane. Draw through A the lines AB and AC parallel to the coordinate axes, and 
suppose that the solutions z and u and the coefficients of the differential equations 
and their first derivatives are continuous in ABC. The above equation gives 


\[
\int_A^C M\,dy + \int_C^B (M\,dy - N\,dx) - \int_B^A N\,dx = 0.
\]


Insert the above values of M and N into this equation, and one gets 


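\[
\begin{aligned}
\int_A^C M\,dy &= \int_A^C \left(\frac12\frac{\partial (uz)}{\partial y} - z\left(\frac{\partial u}{\partial y} - au\right)\right)dy,\\
\int_A^B N\,dx &= \int_A^B \left(\frac12\frac{\partial (uz)}{\partial x} - z\left(\frac{\partial u}{\partial x} - bu\right)\right)dx.
\end{aligned}
\]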

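If quite generally one denotes by $g_P$ the value of a function g at a point P, one
has

\[
\begin{aligned}
\int_A^C M\,dy &= \frac12\bigl((uz)_C - (uz)_A\bigr) - \int_A^C z\left(\frac{\partial u}{\partial y} - au\right)dy,\\
\int_A^B N\,dx &= \frac12\bigl((uz)_B - (uz)_A\bigr) - \int_A^B z\left(\frac{\partial u}{\partial x} - bu\right)dx,
\end{aligned}
\]

so

\[
(uz)_A = \frac12\bigl((uz)_B + (uz)_C\bigr) - \int_B^C (M\,dy - N\,dx)
 - \int_A^B z\left(\frac{\partial u}{\partial x} - bu\right)dx
 - \int_A^C z\left(\frac{\partial u}{\partial y} - au\right)dy. \tag{31.31}
\]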


We examine each term on the RHS. 
We imagine that, with Riemann, we have to find the solution z of the given partial 
differential equation that takes given values, along with one of its derivatives, at all 
points of the curve B’C’. The equation

\[
dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy
\]

applied to a displacement along the curve evidently determines whichever one of the 
partial derivatives that was not given a priori, so we can consider that they are both 
known at each point of the curve B’C’. It follows that if one has chosen a solution u 
of the adjoint equation, the three terms 


\[
(uz)_B, \qquad (uz)_C, \qquad \int_B^C (M\,dy - N\,dx)
\]


that enter the RHS of Eq. (31.31) are known and depend only on the bounding 
conditions on z. If one could calculate the latter two integrals on the RHS one would 
know $z_A$, that is to say the value of z at an arbitrary point of the plane. Now, in
general these integrals depend on the entirely unknown values that the sought-for 
solution z takes on the line segments AB and AC. For these values not to intervene 
it is necessary that the solution u has been chosen so that 


\[
\frac{\partial u}{\partial x} - bu = 0 \quad \text{everywhere on } AB,
\qquad
\frac{\partial u}{\partial y} - au = 0 \quad \text{everywhere on } AC.
\]


If these two equations can be satisfied, then the fundamental Eq. (31.31) reduces to 
the following: 


\[
(uz)_A = \frac12\bigl((uz)_B + (uz)_C\bigr) - \int_C^B (N\,dx - M\,dy), \tag{31.32}
\]


which determines the value of z at an arbitrary point of the plane as a function of the 
boundary conditions only. 

Thus, to obtain the general solution of the equation in the form most appropriate 
for problems in mathematical physics, it is enough to find a solution of the adjoint 
equation that satisfies the two equations stated above. 

These conditions can be transformed as follows. One must have $\partial u/\partial x - bu = 0$
everywhere on AB. Because only x varies on this segment, one can integrate this
equation, which gives

\[
u_M = u_A \exp\left(\int_A^M b\,dx\right)
\]


for all points M between A and B. Likewise, the second condition can be replaced 
by 




\[
u_N = u_A \exp\left(\int_A^N a\,dy\right)
\]


for all points N between A and C. 

One can always reduce the constant $u_A$ to unity, so, if the coordinates of A are
$(x_0, y_0)$, the question is reduced to finding a solution $u(x, y; x_0, y_0)$ of the adjoint
equation depending on two parameters $x_0, y_0$, that reduces to unity for $x = x_0$, $y = y_0$,
taking the value $\exp\left(\int_{x_0}^{x} b\,dx\right)$ for $y = y_0$ and the value $\exp\left(\int_{y_0}^{y} a\,dy\right)$ for $x = x_0$.
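
As an illustration that is not in Darboux's text, consider the special case a = b = 0 with c a constant, so that the equation is ∂²z/∂x∂y + cz = 0 and coincides with its adjoint. The conditions above then ask for u = 1 on the lines x = x_0 and y = y_0. The power series Σ (−cs)^k/(k!)² with s = (x − x_0)(y − y_0), which is the Bessel function J_0(2√(cs)), satisfies them. The sketch below checks this numerically; the function names and the sample values of x_0, y_0, and c are illustrative choices, and the mixed derivative is only approximated by central differences.

```python
def riemann_u(x, y, x0, y0, c, terms=40):
    """Series sum_{k>=0} (-c*s)^k / (k!)^2 with s = (x - x0)*(y - y0)."""
    s = (x - x0) * (y - y0)
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive terms is -c*s / (k+1)^2.
        term *= -c * s / ((k + 1) ** 2)
    return total


def mixed_derivative(f, x, y, h=1e-3):
    """Central-difference approximation of d^2 f / (dx dy)."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)


# Illustrative parameter values (assumptions, not taken from the source).
x0, y0, c = 0.2, -0.1, 3.0
u = lambda x, y: riemann_u(x, y, x0, y0, c)

# Residual of u_xy + c*u = 0 at a sample point; small up to discretisation error.
residual = mixed_derivative(u, 0.9, 0.7) + c * u(0.9, 0.7)
print(abs(residual) < 1e-4)

# On the two characteristics through (x0, y0) the series reduces to its first term.
print(u(x0, 0.7), u(0.9, y0))
```

For variable coefficients a and b the boundary values are the exponentials stated above rather than 1, and no closed form of this kind is available in general.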

This is the fundamental result established by Riemann. The great mathematician 
had been able to determine a function u for the equation that he had discussed and 
which is no other than equation E(β, β). We shall see that the determination of this
function can also be carried out for the more general equation E(β, β′), but first
staying with the general theory we shall add an essential remark to the result we have 
just given. 

Section 359 Suppose that the primitive curve BC reduces to two straight lines C’D 
and DB parallel to B’D and DC’ [the y and x axes, respectively], and let $(x_1, y_1)$
be the coordinates of the point D. One will have 


\[
\int_C^B (N\,dx - M\,dy) = \int_C^D N\,dx - \int_D^B M\,dy.
\]


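Moreover, one can write

\[
\int N\,dx = \int \left(\frac12\left(u\frac{\partial z}{\partial x} - z\frac{\partial u}{\partial x}\right) + buz\right)dx
 = \int \left(-\frac12\frac{\partial (uz)}{\partial x} + u\left(\frac{\partial z}{\partial x} + bz\right)\right)dx,
\]

so

\[
\int_C^D N\,dx = \frac12\bigl((uz)_C - (uz)_D\bigr) + \int_C^D u\left(\frac{\partial z}{\partial x} + bz\right)dx.
\]

Likewise one has

\[
\int_D^B M\,dy = \frac12\bigl((uz)_D - (uz)_B\bigr) + \int_D^B u\left(\frac{\partial z}{\partial y} + az\right)dy.
\]

Therefore, substituting these values in the above equations, one has

\[
(uz)_A = (uz)_D - \int_C^D u\left(\frac{\partial z}{\partial x} + bz\right)dx - \int_B^D u\left(\frac{\partial z}{\partial y} + az\right)dy. \tag{31.33}
\]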
C Ox D dy 


This formula applies to every solution z of the given equation. It offers the greatest 
analogy with the general Eq. (31.32), but it is distinguished by an essential property. 


Indeed, one recognises that it is not now necessary to prescribe one of the derivatives
of z on the contour C’ DB’. Knowing only the values of the sought-for solution on 
the lines C’D and DB’ allows one to calculate the two integrals that the preceding 
formula contains and to obtain the value of this solution. It is necessary to look for 
the origin of this very interesting result in the circumstance that the new contour is 
formed with the characteristics of the given linear equation. 

Suppose now that one takes for z the particular solution $z(x, y; x_1, y_1)$ of the
given equation that is given by entirely similar conditions to those indicated for
$u(x, y; x_0, y_0)$ considered as a solution of the adjoint equation. As it is necessary to
change the sign of the coefficients a and b when one passes from one equation to the
other, one sees that this solution must reduce

\[
\text{for } y = y_1 \text{ to } \exp\left(-\int_{x_1}^{x} b\,dx\right),
\qquad
\text{for } x = x_1 \text{ to } \exp\left(-\int_{y_1}^{y} a\,dy\right),
\]


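and consequently to 1 for $x = x_1$, $y = y_1$. Therefore one has

\[
\frac{\partial z}{\partial x} + bz = 0 \ \text{at all points of } CD, \qquad
\frac{\partial z}{\partial y} + az = 0 \ \text{at all points of } BD, \qquad
z = 1 \ \text{at } D.
\]

Consequently, Eq. (31.33) reduces here to $z_A = u_D$, that is to say

\[
z(x_0, y_0; x_1, y_1) = u(x_1, y_1; x_0, y_0).
\]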


This equality implies the following proposition: The solution $u(x, y; x_0, y_0)$ of the
adjoint equation that we defined before can be considered as a function of the parameters
$x_0, y_0$; it is then a solution of the primitive equation (where one will have replaced
x, y by $x_0, y_0$) and possesses, with respect to that equation and the variables $x_0, y_0$,
the properties by which it has been defined as a function of the variables x, y and is
a solution of the adjoint equation. In other words, the definition of u does not change
if one switches the linear equation and its adjoint, on condition that one switches the
two systems of variables x, y and $x_0, y_0$.

It follows that the determination of this function $u(x, y; x_0, y_0)$ also allows one to
integrate the adjoint equation by a formula analogous to what has been given above.
The integration of two linear equations, the given one and its adjoint, therefore leads
to one and the same problem, the determination of the function $u(x, y; x_0, y_0)$. This
function can be completely defined, either as the solution of the given equation, or
as the solution of the adjoint equation, via the boundary conditions to which they are
subjected.


31.7 Picard and Elliptic Partial Differential Equations
(1890) 


Picard showed in his [211] that these equations have regular solutions, just like the 
Laplace equation. The paper opens as follows: 


Introduction 


Let us consider a second-order partial differential equation of the form 


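\[
A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2}
 = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \tag{31.34}
\]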


A, B, C depending only on the two independent variables x and y. In order to solve 
this equation under certain specified boundary conditions, one can proceed by suc- 
cessive approximations in the following manner. In the second member we insert an 
arbitrary function $u_1$ of x and y, and form the equation

\[
\Delta u_2 = F\left(u_1, \frac{\partial u_1}{\partial x}, \frac{\partial u_1}{\partial y}, x, y\right)
\]

(here putting, for brevity, $\Delta u = A\,\partial^2 u/\partial x^2 + 2B\,\partial^2 u/\partial x\,\partial y + C\,\partial^2 u/\partial y^2$). Let us suppose that we
have solved this equation for $u_2$ and provided certain boundary conditions that, we
suppose, completely determine an integral that we denote by $u_2$. One then forms the
equation

\[
\Delta u_3 = F\left(u_2, \frac{\partial u_2}{\partial x}, \frac{\partial u_2}{\partial y}, x, y\right)
\]

and solves it for $u_3$ under the same boundary conditions as above, and continues in
this fashion indefinitely. If the solution $u_n$ tends to a definite limit u as n increases
indefinitely, one then obtains the solution u of Eq. (31.34) that satisfies the given
conditions.

These generalities only have interest when one can make the boundary conditions 
precise and put in place conditions that allow us to establish rigorously the convergence
of $u_n$ to the limit u; this is the point of this memoir. We make the essential
supposition that in the region of the plane containing the point (x, y) the discriminant 
B? — AC does not change its sign. Consequently, we can reduce our equation to one 
of the two following types: 


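\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
 = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \tag{31.35}
\]

\[
\frac{\partial^2 u}{\partial x\,\partial y}
 = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right), \tag{31.36}
\]

for which the problems posed are entirely different.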

For equations of the first form we provide as boundary conditions the values of 
the functions $u_n$ along a closed contour C, and we require them to remain continuous
along with their first two differentials inside the contour. The study of $u_n$ shows that it
converges to a limit provided that C satisfies certain conditions that, in particular, are 
satisfied when the contour bounds a sufficiently small area. In this case one obtains 
a solution of Eq. (31.35) that takes a given continuous succession of values on the 
contour. This solution, as we shall show, is moreover unique if the equation is linear, 
when the contour is sufficiently small. 

One cannot affirm in general that the solution is unique when F is not linear in
u, $\partial u/\partial x$, and $\partial u/\partial y$.

In the case of Eq. (31.36), the boundary conditions must be taken in an entirely
different manner. Here we take an arc of a curve C, and along C we prescribe the
values of $\partial u_n/\partial x$ and $\partial u_n/\partial y$ as well as the value of $u_n$ at a point A of C. Let B be a
second point of C, such that the coordinates of a point M of the curve constantly
vary in the same sense as M goes from A to B; consider the rectangle parallel to
the axes of which A and B are opposite vertices. If B is sufficiently close to A, $u_n$
will tend to a limit u for all points of this rectangle, and one will have the solution
u of Eq. (31.36) that takes a given value at A and for which $\partial u/\partial x$ and $\partial u/\partial y$ take a given
continuous succession of values on the arc AB; u and its two first partial derivatives
are continuous functions of x and y as one traverses the arc AB.

The theorems indicated above for Eq. (31.35) are only correct if the contour 
C encloses a sufficiently small area. It is very interesting to find equations where 
without restriction a solution that is continuous together with its partial derivatives 
will always be determined by its values on an arbitrary closed contour. 

One can give some detailed examples. This will happen for the Eq. (31.35) 


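\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
 = F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right)
\]

if, on replacing $\partial u/\partial x$ by v and $\partial u/\partial y$ by w, one has the inequality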


an > (hy ey 
whatever u, v, w may be. In particular, if F depends on neither $\partial u/\partial x$ nor $\partial u/\partial y$, this
condition will be satisfied.
We shall make a special study of the case where the equation can be written as 


\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = F(u, x, y),
\]
and F increases continually with u. 
Supposing first of all that F is always positive, we shall see what our method of 
successive approximations can give here. Its use leads to a very curious result. This 
method leads not to one limit, but to two limits u and v. These functions take the
given values on the contour and satisfy the two equations


\[
\Delta u = F(v, x, y), \qquad \Delta v = F(u, x, y).
\]


In order that the problem of finding a solution to the given equation and taking the 
given values on the contour can be solved, it is necessary that u = v; this will not 
be the case for an arbitrary contour, but this identity is verified if the contour is 
sufficiently small. 

In this particular case we shall show in what follows that one can pass to an 
arbitrary contour. In fact, the problem being treated for two contours having a part 
in common can be solved for the bounding contour exterior to the two areas. The 
alternating process, that M. Schwarz and M. Neumann used in their memorable 
works on the Laplace equation Δu = 0, can, with modifications that are in any case
quite obvious, be extended to our general equation, and, as a result, prove completely
effective in the study of the integral, which moreover is unique, of the equation


\[
\Delta u = F(u, x, y)
\]


that takes a given continuous succession of values on an arbitrarily closed contour.
I also consider an interesting case in which the function F, which always increases
with u, vanishes for u = 0.
The solutions considered up to now are continuous inside the area. Taking in 
particular the equation 
\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = A(x, y)\,e^{u},
\]

I examine the case where the integral has logarithmic singular points, and I particularly
direct my attention to the following equation which is of great interest, in
geometry as much as in analysis,

\[
\Delta u = k e^{u},
\]

where k denotes a positive constant, and which one can call Liouville’s equation. I
deepen the study of the solutions of this equation by considering them in the whole 
plane and by studying especially those which are continuous in the whole plane with 
the exception of a certain number of logarithmic singular points for which one regards 
the corresponding coefficients as given (only satisfying certain inequalities); I draw 
attention here to the following result: These solutions depend only on an arbitrary 
constant, and a solution of this kind is determined when its value is given at a point 
of the plane distinct from the singular points. 

Having studied these solutions in the ordinary plane, I extend this to the multiple 
plane, that is to say to the plane covered by a certain number of leaves forming a 
Riemann surface. 

I remark, in ending this chapter, that the above results relative to the equation

\[
\Delta u = k e^{u}
\]


are not without interest for the theory of Fuchsian functions. I shall come back to this 
special application of the general theory that I have tried to develop in this memoir. 

In a final chapter I apply to ordinary differential equations the approximation 
methods that I have used. It is particularly interesting to consider a system of differ- 
ential equations of the form 


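\[
\begin{aligned}
\frac{d^2 y_1}{dx^2} &= f_1(x, y_1, y_2, \dots, y_m),\\
\frac{d^2 y_2}{dx^2} &= f_2(x, y_1, y_2, \dots, y_m),\\
&\ \,\vdots\\
\frac{d^2 y_m}{dx^2} &= f_m(x, y_1, y_2, \dots, y_m),
\end{aligned}
\]


and to obtain a system of solutions for these equations continuous between x = a and
x = b and taking given values at these extremes.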
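
The scheme Picard describes for ordinary differential equations can be sketched numerically. The example below is an illustration under simplifying assumptions that are not in the memoir: a single equation y'' = f(x, y) rather than a system, a uniform grid, and second differences in place of the second derivative. Each step solves y_{k+1}'' = f(x, y_k) with the same end values, which is the iteration described above; the function names are hypothetical.

```python
import math


def solve_second_difference(g, ya, yb, h):
    """Solve y[i-1] - 2*y[i] + y[i+1] = h^2 * g[i] with y(a) = ya, y(b) = yb.

    g holds the right-hand side at the interior nodes (Thomas algorithm).
    """
    m = len(g)
    rhs = [h * h * gi for gi in g]
    rhs[0] -= ya
    rhs[-1] -= yb
    diag = [-2.0] * m
    # Forward elimination; the off-diagonal entries are all equal to 1.
    for i in range(1, m):
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        y[i] = (rhs[i] - y[i + 1]) / diag[i]
    return [ya] + y + [yb]


def picard_bvp(f, a, b, ya, yb, n_grid=100, n_iter=30):
    """Successive approximations y_{k+1}'' = f(x, y_k) with fixed end values."""
    h = (b - a) / n_grid
    xs = [a + i * h for i in range(n_grid + 1)]
    # Starting function: the straight line joining the two end values.
    y = [ya + (yb - ya) * (x - a) / (b - a) for x in xs]
    for _ in range(n_iter):
        g = [f(xs[i], y[i]) for i in range(1, n_grid)]
        y = solve_second_difference(g, ya, yb, h)
    return xs, y


# Test problem: y'' = y, y(0) = 0, y(1) = 1, whose solution is sinh(x)/sinh(1).
xs, ys = picard_bvp(lambda x, y: y, 0.0, 1.0, 0.0, 1.0)
exact_mid = math.sinh(0.5) / math.sinh(1.0)
print(abs(ys[50] - exact_mid) < 1e-3)
```

On a short interval the iteration contracts quickly; on a long interval it need not converge, which mirrors the smallness conditions that recur throughout the memoir.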

The same considerations apply to a system of partial differential equations of the 
form 


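\[
\begin{aligned}
\frac{\partial^2 u_1}{\partial x^2} + \frac{\partial^2 u_1}{\partial y^2} &= f_1(x, y, u_1, u_2, \dots, u_m),\\
\frac{\partial^2 u_2}{\partial x^2} + \frac{\partial^2 u_2}{\partial y^2} &= f_2(x, y, u_1, u_2, \dots, u_m),\\
&\ \,\vdots\\
\frac{\partial^2 u_m}{\partial x^2} + \frac{\partial^2 u_m}{\partial y^2} &= f_m(x, y, u_1, u_2, \dots, u_m).
\end{aligned}
\]


By imposing certain very general hypotheses on the f one can determine a system
of solutions $u_1, u_2, \dots, u_m$ taking given values on a contour. [Extract ends.]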


31.8 Picard and Hyperbolic Partial Differential Equations 
(1890) 


Picard showed that these equations can be solved with initial conditions on suitable 
arcs. 

Among the equations of the type to be considered, it will be enough to consider
the equation


\[
\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz,
\]


because once this equation has been dealt with it will be enough to repeat the same 
arguments as above to deal with the more general equation 


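\[
\frac{\partial^2 z}{\partial x\,\partial y} = F\left(z, \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, x, y\right).
\]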

There can be no question here of finding the solution in terms of its values along 
a closed contour; we have to study the method of successive approximations with a 
view to determining the general integral. 

Let us consider in the (x, y)-plane the arc of an arbitrary curve C for which we 
suppose only that either of the coordinates is a function of the other, and always 
increases in the same sense. We want to obtain a solution of this equation for which 
the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ take a given succession of values on C and which
itself takes a specified value at A on C. 


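One first tackles the problem for the equation

\[
\frac{\partial^2 z_1}{\partial x\,\partial y} = 0.
\]

Let $z_1$ be a solution; one must then consider the equation

\[
\frac{\partial^2 z_2}{\partial x\,\partial y} = a\frac{\partial z_1}{\partial x} + b\frac{\partial z_1}{\partial y} + c z_1.
\]

One looks for a solution for which $\partial z_2/\partial x$ and $\partial z_2/\partial y$ vanish on C and $z_2$ itself vanishes at
A.
One then considers the equation in $z_3$

\[
\frac{\partial^2 z_3}{\partial x\,\partial y} = a\frac{\partial z_2}{\partial x} + b\frac{\partial z_2}{\partial y} + c z_2,
\]

which one solves under the same conditions, and continues in this way indefinitely.
It is now necessary to study the series

\[
z_1 + z_2 + \cdots + z_n + \cdots \tag{31.37}
\]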


and to see if it gives a solution of the problem stated. [...] 
Let us remark right away that the solution of the equation 

\[
\frac{\partial^2 z_1}{\partial x\,\partial y} = 0
\]

is immediate if on C one gives $\partial z_1/\partial x$ as a function of x and $\partial z_1/\partial y$ as a function of y. Let
φ(x) and ψ(y) be these two functions. One will then have

\[
z_1 = \int_{x_0}^{x} \varphi(x)\,dx + \int_{y_0}^{y} \psi(y)\,dy,
\]

where A has coordinates $(x_0, y_0)$.

Fig. 31.2 The curve C and the point P

On the other hand, let the equation be

\[
\frac{\partial^2 z}{\partial x\,\partial y} = F(x, y),
\]

where F is a continuous function of x and y. The solution of this equation that
vanishes at A and for which $\partial z/\partial x$ and $\partial z/\partial y$ vanish on C can be represented in the following
way: let P be a point with coordinates (x, y), and draw through this point parallels
to the x and y axes meeting the curve C at points M and N: the required solution is
given by the double integral

\[
-\iint F(x, y)\,dx\,dy
\]

taken over the curvilinear triangle PMN.
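
The double-integral formula can be tried out numerically. The sketch below is an added illustration, assuming for simplicity that the curve C is the line y = x (an increasing curve, as Picard requires), so that M = (y, y) and N = (x, x). The function names are hypothetical and the midpoint rule is only one possible quadrature. The iterated signed integral used here equals minus the integral of F over the triangle PMN, whichever side of C the point P lies on.

```python
def signed_integral(f, a, b, n=200):
    """Midpoint rule for the integral of f from a to b (negative step if b < a)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def solution_on_diagonal(F, x, y, n=200):
    """Value at P = (x, y) of minus the double integral of F over triangle PMN.

    Assumes the curve C is the line y = x, so M = (y, y) and N = (x, x).
    """
    def inner(xi):
        # Integrate in the second variable from the curve (eta = xi) to the height of P.
        return signed_integral(lambda eta: F(xi, eta), xi, y, n)
    # Integrate in the first variable from M to P.
    return signed_integral(inner, y, x, n)


# F = 1: the closed form is z = -(x - y)^2 / 2, minus the area of the triangle.
z_const = solution_on_diagonal(lambda x, y: 1.0, 1.0, 0.5)

# F = x*y: the closed form is z = -(x^2 - y^2)^2 / 8.
z_xy = solution_on_diagonal(lambda x, y: x * y, 1.0, 0.5)

print(z_const, z_xy)
```

Both closed forms vanish on the line y = x together with their first derivatives, and differentiating them once in x and once in y returns F, which is what the formula asserts.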

One assumes, as I have already said, that from A to B on the arc C either of the 
coordinates is a continuous function of the other and always varies in the same sense 
(increasing in the case of the figure). 

This done, suppose that the point P lies in the rectangle ABA′B′ [see Fig. 31.2],
and let AB′ = α and BB′ = β; we are going to look for upper bounds on the different
terms of the series (31.37).

Let us denote by M the maximum absolute value of $a\,\partial z_1/\partial x + b\,\partial z_1/\partial y + c z_1$ in the rectangle
ABA′B′. One then has

\[
|z_2| < M\alpha\beta, \qquad \left|\frac{\partial z_2}{\partial x}\right| < M\alpha, \qquad \left|\frac{\partial z_2}{\partial y}\right| < M\beta.
\]


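If moreover the maximum absolute values of a, b, and c in the rectangle are A, B,
and C, respectively, then in the rectangle

\[
\left|a\frac{\partial z_2}{\partial x} + b\frac{\partial z_2}{\partial y} + c z_2\right| < M(A\alpha + B\beta + C\alpha\beta).
\]

Consequently,

\[
|z_3| < M(A\alpha + B\beta + C\alpha\beta)\,\alpha\beta, \qquad
\left|\frac{\partial z_3}{\partial x}\right| < M(A\alpha + B\beta + C\alpha\beta)\,\alpha, \qquad
\left|\frac{\partial z_3}{\partial y}\right| < M(A\alpha + B\beta + C\alpha\beta)\,\beta.
\]

Continuing thus, one arrives in general at

\[
|z_n| < M(A\alpha + B\beta + C\alpha\beta)^{n-2}\,\alpha\beta.
\]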


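It follows that the terms of the series (31.37) can be compared with a geometric
progression; if, therefore,

\[
A\alpha + B\beta + C\alpha\beta < 1 \tag{31.38}
\]

then the series (31.37) and the two series

\[
\frac{\partial z_1}{\partial x} + \frac{\partial z_2}{\partial x} + \frac{\partial z_3}{\partial x} + \cdots, \qquad
\frac{\partial z_1}{\partial y} + \frac{\partial z_2}{\partial y} + \frac{\partial z_3}{\partial y} + \cdots
\]

will converge. As for condition (31.38), it will evidently hold if the point B is sufficiently
close to the point A. The series converge inside the rectangle ABA′B′.
The function z, the limit of the series $z_1 + z_2 + \cdots + z_n + \cdots$, will obviously have
a derivative $\partial^2 z/\partial x\,\partial y$ and satisfy the equation

\[
\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz.
\]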
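
To make the comparison with a geometric progression explicit (a remark added here; the abbreviation r is not Picard's), write r for the left-hand side of condition (31.38). Summing the bounds on the individual terms then gives, when r < 1,

```latex
\[
\sum_{n \ge 2} |z_n| \;\le\; M\alpha\beta \sum_{n \ge 2} r^{\,n-2} \;=\; \frac{M\alpha\beta}{1 - r},
\qquad r = A\alpha + B\beta + C\alpha\beta .
\]
```

The bound does not depend on the position of P in the rectangle, so the convergence is uniform there. Both α and β shrink as B approaches A, which is why r can always be made smaller than 1.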


Thus, under the above hypotheses, we have for the given partial differential equation
a solution z that takes a given value at the point A on the curve, and for which
the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ take, respectively, on C a prescribed continuous
succession of values. These functions φ(x) and ψ(y) in our analysis are subject only
to the single condition of being continuous. Let us remark that z, $\partial z/\partial x$, and $\partial z/\partial y$ are
continuous functions of x and y even when one crosses the arc C; here there is an
interesting point in the theory of partial differential equations that it is good to insist
upon.

A solution z of a linear second-order partial differential equation is, one says in
a general way, determined when one prescribes the values of z and $\partial z/\partial x$ on a curve
C, or, which comes to the same thing, the values of $\partial z/\partial x$ and $\partial z/\partial y$ on this curve and the
value of z at a particular point of C. But this general conception is only valuable for
a curve C traced in a region of the plane where the characteristics are real, that is
to say, only in this case, when one is certain to have a solution satisfying the given
conditions that is continuous along with its first-order partial derivatives when one
crosses the arc C; our preceding analysis shows very neatly that z, $\partial z/\partial x$, and $\partial z/\partial y$ are
continuous in the passage over C.

It is quite otherwise when the characteristics are imaginary. To see this, it suffices 
to take the simple example of the equation 


\[
\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0.
\]

In general one cannot have a solution of this equation that is continuous in the
rectangle ABA′B′ along with its first-order partial derivatives, and for which $\partial z/\partial x$ and
$\partial z/\partial y$ take on the arc AB the succession of values denoted above by φ(x) and ψ(y), these
functions being subject to no other condition than being continuous. In the contrary
case one could, in effect, form an analytic function $z + iz_1$ that will be holomorphic
in the rectangle under consideration, the real part of this function being arbitrary on
the curve AB, which is impossible because a holomorphic function determined on
an arc of a curve however small can only be extended in a unique way.

Thus the proof that we have given of the existence of a solution of the equation 


\[
\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz,
\]


and its development in series allows us to raise a question that we must necessarily 
put on one side, when one supposes that a, b, c are analytic functions and that the 
conditions on the bounds are expressed by means of analytic functions. 
The linear equation

\[
\frac{\partial^2 z}{\partial x\,\partial y} = a\frac{\partial z}{\partial x} + b\frac{\partial z}{\partial y} + cz
\]


has been the object of a remarkable chapter in Darboux’s Leçons sur la théorie des
surfaces (Vol. II, Chap. IV). Following an idea of Riemann’s, Darboux reduced the 
solution of this equation to the study of a particular solution z; this solution z is determined
by the condition that it reduces for $x = x_0$ to a given function ψ(y) and for $y = y_0$
to a given function φ(x). Darboux
established the existence of such a solution in supposing that a, b, c are analytic functions
of x and y, and he uses as an intermediary the celebrated equation considered
by Euler and Poisson. Staying with our point of view of successive approximations, 
the proof of the existence of such a solution z is quite easy, without making any
other assumption about a, b, c, φ, and ψ other than that of continuity. [Evidently
one has $\varphi(x_0) = \psi(y_0)$, and one assumes that φ and ψ have first derivatives.] [Here
the extract ends.]

Picard then sketched the slight modifications of his earlier proof needed to adapt 
it to the new situation. 


Appendix A 
Newton’s Principia Mathematica 


Newton’s theory of celestial mechanics, set out in his Principia Mathematica 
(Fig. A.1) [206], is based on his analysis of the concept of motion and its causes, a 
thorough-going mathematical analysis of the motion of bodies under the action of 
forces, and a meticulous study of the observed motion of the Moon, the planets, and 
their satellites. This led him to proclaim his highly novel theory of gravity, and to 
refute the earlier and widely accepted ideas of Descartes. A notable success on the
way was his integration of Kepler’s laws, and his study of the motion of a body under 
a central force. He came up with a remarkably accurate description of the motion 
of the planets and their satellites based on his inverse-square law of gravity, but it 
nonetheless failed to account for the motion of the Moon, and this was a cause of
great controversy in the generation after his death.


A.1 Newton’s Laws of Motion in His Principia 


Newton’s Principia was published in late 1687. It is a book of 547 pages, written in 
scholarly Latin, and after some introductory remarks and a few definitions it opens 
with these three laws of motion.¹


Law 1 Every body perseveres in its state of being at rest or of moving uniformly straight 
forward except insofar as it is compelled to change its state by forces impressed. Projectiles 
persevere in their motions, except insofar as they are retarded by the resistance of the air and 
are impelled downward by the force of gravity. A spinning hoop, which has parts that by 
their cohesion continually draw one another back from rectilinear motions, does not cease 
to rotate, except insofar as it is retarded by the air. And larger bodies — planets and comets 
— preserve for a longer time both their progressive and their circular motions, which take 
place in spaces having less resistance. 


1See Principia Axioms, or the laws of motion in the Cohen and Whitman translation of 1999, pp.
416-417, and (in the Motte—Cajori translation) pp. xvii—xix and F&G 12. B2. 

Fig. A.1 Title page of
Newton’s Principia 


Law 2 A change in motion is proportional to the motive force impressed and takes place 
along the straight line in which that force is impressed. If some force generates any motion, 
twice the force will generate twice the motion, and three times the force will generate three 
times the motion, whether the force is impressed all at once or successively by degrees. And 
if the body was previously moving, the new motion (since motion is always in the same 
direction as the generative force) is added to the original motion if that motion was in the 
same direction or is subtracted from the original motion if it was in the opposite direction or, 
if it was in an oblique direction, is combined obliquely and compounded with it according 
to the directions of both motions. 


Law 3 To any action there is always an opposite and equal reaction; in other words, the 
actions of two bodies upon each other are always equal and always opposite in direction. 
Whatever presses or draws something else is pressed or drawn just as much by it. If anyone 
presses a stone with a finger, the finger is also pressed by the stone. If a horse draws a stone 
tied to a rope, the horse will (so to speak) also be drawn back equally toward the stone, for 
the rope, stretched out at both ends, will urge the horse toward the stone and the stone toward 
the horse by one and the same endeavor to go slack and will impede the forward motion of 
the one as much as it promotes the forward motion of the other. If some body impinging 
upon another body changes the motion of that body in any way by its own force, then, by 
the force of the other body (because of the equality of their mutual pressure), it also will in 
turn undergo the same change in its own motion in the opposite direction. By means of these 
actions, equal changes occur in the motions, not in the velocities — that is, of course, if the 
bodies are not impeded by anything else. For the changes in velocities that likewise occur in 
opposite directions are inversely proportional to the bodies because the motions are changed 
equally. This law is valid also for attractions, as will be proved in the next scholium. 


Newton’s laws of motion are stated as axioms, and accordingly neither derived 
from other statements nor based on experiments. Newton gave an explanation and 
elucidation of each law, but not a justification. And, as befits axioms, the laws are 
the basis for subsequent deductions concerning the behaviour of moving bodies and
bodies acted upon by forces. Newton’s laws state presumed properties of matter in 
motion; they are not specifically mathematical, being neither geometric nor algebraic. 
They are not stated in the form of equations involving symbols and their manipulation. 

In a book devoted to the study of motion in general and planetary motions in 
particular Newton had to decide what were the crucial astronomical ideas that he 
needed. He chose to rely on all three of Kepler’s laws, and Newton’s vast generalisa- 
tion of Kepler’s somewhat controversial second (equi-area) law, prominently placed 
near the front of the book, was to play a vital role in the theory he presented (see 
Sect. A.1.1). 

The Newton scholar I.B. Cohen has observed that²:


It was an unusual and a very daring step to erect an astronomical system encompassing 
Kepler’s three laws, as Newton did. Following the imaginative leap forward that Newton 
made, in showing the physical meaning and conditions of mathematical generality or appli- 
cability of each of Kepler’s laws, this whole set of three laws gained a real status in exact 
science. 


A.1.1 The Content of the Principia


Before Book I begins, the Principia has an introduction in which Newton spelled out 
the mathematically precise concepts used in his three laws of motion. He defined 
the quantity of matter and the quantity of force, and he discussed forces of various 
kinds. He then gave a complicated distinction between relative and absolute motion 
and relative and absolute time: in many ways Newton treated all motion as relative, 
but he also regarded the centre of the universe (which he regarded as the centre of the 
solar system) as being absolutely at rest. Only then come the three axioms or laws 
of motion we looked at above, and the first of their elementary consequences. 

This book then gives a long, careful, cumulative discussion of “the method of first 
and last ratios of quantities”: a geometrical study of curves and their tangents in the 
spirit in which Newton conducted his investigations of the calculus. 

Then Newton turned to a study of the motion of a point under a centripetal force. 
He showed that the line joining a fixed point to a moving one sweeps out equal areas 
in equal times if and only if the force on the moving point is directed towards the 
fixed point. Remarkably, the size of the force can depend in any way on the length 
of the radial line; the orbit can be any shape determined by the law, not just a circle 
or an ellipse. 

Among the special cases that are then worked out is this one: if the moving body 
traverses a conic section under a centripetal force directed towards one focus, then the 
magnitude of the force is inversely proportional to the square of the distance. Newton 
knew well, as his acceptance of Kepler’s laws indicates, that this is the relevant case 
in astronomy. In the first edition of the Principia he also stated the converse (it is 


2See Cohen ([46], 229). 
Cor. 1 to Prop. 13): under an inverse-square law, bodies move in curves which are
conic sections having the centre of force as one focus. Controversies surrounding 
this statement led Newton to enrich it with a skeleton proof in the second edition 
(1713). 

To find where in its orbit a planet can be found at any particular time Newton 
used Kepler’s equi-area law. He also showed that if planets traverse ellipses under 
the action of a force that obeys an inverse-square law, then they necessarily obey 
Kepler’s third law (the 3/2 power law, which says that if r is the radius of the orbit 
of a planet, and t is the time to complete one orbit, then $t^2 \propto r^3$). 
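
For the special case of a circular orbit the 3/2 power law can be checked in a few lines. The following Python sketch is an illustration added here, not Newton's own argument; the function name `circular_period` and the choice of units (force constant `mu = 1`) are assumptions made for the example.

```python
import math

def circular_period(r, mu=1.0):
    """Period of a circular orbit of radius r under an inverse-square force mu / r**2."""
    speed = math.sqrt(mu / r)          # from speed**2 / r = mu / r**2
    return 2.0 * math.pi * r / speed   # circumference divided by speed

radii = [1.0, 2.0, 4.0, 9.0]
# If t**2 is proportional to r**3, this ratio is the same for every radius.
ratios = [circular_period(r) ** 2 / r ** 3 for r in radii]
```

Every entry of `ratios` agrees with 4π²/mu up to rounding, which is the statement that the square of the period is proportional to the cube of the radius.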

Newton then investigated the attraction between solid bodies under an inverse- 
square law. He established that a spherical shell exerts no force on a point inside it and 
attracts a point outside it in the same way as a point mass concentrated at the centre. 
This remarkable result, which surprised him as much as his contemporaries, enabled 
him to reduce the study of large spherical objects like planets to the study of points 
and centripetal forces, which he had already described. In Newton’s theory of gravity, 
large solid spheres may be replaced by points (of the same mass)—a considerable 
simplification in the theory. For much work in astronomy, the assumption that planets 
and the sun are spherical in shape is entirely reasonable. 
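
The shell result can be checked numerically. The sketch below is an added illustration under simplifying assumptions (unit shell mass, gravitational constant 1, a midpoint-rule sum over rings of the shell); the name `shell_attraction` is hypothetical and not taken from the text.

```python
import math

def shell_attraction(d, R=1.0, n=20000):
    """Attraction of a uniform shell (radius R, unit mass, G = 1) on a unit
    point mass at distance d from the centre, summed ring by ring."""
    total = 0.0
    h = math.pi / n
    for k in range(n):
        theta = (k + 0.5) * h
        # The ring at polar angle theta carries mass (1/2) sin(theta) d(theta).
        ring_mass = 0.5 * math.sin(theta) * h
        s2 = R * R + d * d - 2.0 * R * d * math.cos(theta)  # squared distance to the ring
        # Only the component along the axis through the centre survives.
        total += ring_mass * (d - R * math.cos(theta)) / s2 ** 1.5
    return total

outside = shell_attraction(2.0)   # compare with the point-mass value 1 / 2.0**2
inside = shell_attraction(0.5)    # compare with 0
```

The outside value agrees with that of a point mass at the centre, and the inside value is zero to within the accuracy of the sum.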

Newton discussed many topics in Book II, subtitled “The Motion of Bodies (in 
resisting media)”, but we need to note only that at the end of the book Newton 
demolished Descartes’s theory of motion in vortices and concluded: “Hence it is 
manifest that the planets are not carried round in corporeal vortices”.³ 

In Book III, “The System of the World (in mathematical treatment)”, Newton 
demonstrated that the theory of an inverse-square law for gravity acting as a force 
between bodies can explain the motion of the planets and of their satellites and the 
motion of comets, and enables the shape of the Earth to be determined. But the Moon 
gave him trouble, and he could deduce only that its motion obeys the equi-area law. 


A.2. The Motion of the Moon 


The motion of the Moon is far from simple, and it has been intensively studied not 
only because it is our nearest neighbour in space but because, if it could be understood 
accurately, it would provide an excellent clock and so be an aid to navigation. 

The problem is that the Moon is part of a system of three bodies: the Earth, the 
Moon, and the Sun, and while Newton could deal well with two bodies acting on each 
other by gravity, the three-body problem, as it became known, is (strictly speaking) 
unsolved to this day. No one can yet answer the question: given three arbitrary bodies 
acting on each other by gravity and released initially with such-and-such velocities, 
what will be their orbits for all future times? Will the Moon always orbit the Earth, 
move away, or eventually collide with it? We do not know. 


3See Newton, Principia, Book II, 790. 

But if the mathematical problem is too difficult to solve exactly, exhaustive computer 
simulations enable scientists to navigate satellites around the solar system with 
astonishing accuracy and to make a variety of predictions about the long-term fate of 
the solar system (and, equally interesting, about its origins). These predictions say 
that the Moon will still be orbiting the Earth 50 million years from now. 

In the seventeenth and eighteenth centuries conclusions could only be reached 
if some simplifying assumptions were made. Newton assumed, for example, that the 
only effect of the Sun is to perturb slightly an otherwise elliptical orbit of the Moon 
around the Earth, and then tried to calculate that perturbation exactly (as measured 
by the motion of the apse of the ellipse). His success was less than complete, and 
because the Moon is an easy object to observe the mismatch between his, and indeed 
any, prediction and reality was apparent. 

Newton’s calculations in Principia I, 45 showed that the elliptical orbit rotates slowly 
around the centre of gravity of the Earth and the Moon, returning to its original place 
every 18 years.⁴ But, as Newton conceded in later editions of the Principia in a single 
crisp sentence: “The [advance of the] apsis of the Moon is about twice as swift”.⁵ 
He was unable to come up with a significantly more convincing and accurate theory, 
although unpublished papers show that he was able to get a better approximation to 
the Moon’s motion, and it says a lot about his high standards that he was displeased 
that his approximate theory was out by a factor of 2. 

But this point is not merely technical; its implications were profound. This failure 
of Newton’s, it was thought, might be the loose thread that would unravel his theory. 
And his theory of universal gravitation was unpopular in Cartesian circles, and ini- 
tially too difficult to understand in all circles. As a result, Newtonian gravity (more 
precisely, the inverse-square law) now came to stand or fall by its ability to describe 
sufficiently accurately the motion of the Moon. 

Significant progress on the question had to wait until 1747. In that year, Clairaut 
wrote to Euler that it was “a proven fact that Newtonian gravitation is inadequate 
to account for the [lunar] phenomena”.⁶ He therefore proposed to add a small 
inverse fourth-power term to the inverse-square law (making a law of the form 
$f(r) = ar^{-2} + br^{-4}$). D’Alembert had come independently to the same opinion 
that Newton’s theory was incorrect, as did Euler, on the basis of his study of a 
different three-body system (the Sun, Jupiter, and Saturn), although each man had a 
different remedy in mind.⁷ 

Euler had already submitted his essay on the motion of Saturn to the Académie des 
Sciences in Paris for consideration in their prize competition. In it, he expressed his 
doubts about the inverse-square law, particularly Newton’s failure with the motion 
of the apse of the Moon, and he hinted that he wished to re-introduce vortices (thus 
he framed what Newton would have called “an hypothesis”, an ad hoc mechanism). 
Clairaut, one of the judges, read the essay in September 1747, recognised Euler’s 


4See Newton Principia I, Sect. 9, 534-545. 

5See the Cohen and Whitman translation, p. 545. 

6For this exchange of letters, see Euler Opera Omnia (4A) 5, 173-175 and F&G 14.B2 and 14.B4. 
7See Wilson [274] for a good account, and for references to the primary literature. 

handwriting, and wrote to Euler on 11 September to say that he was delighted that 
Euler had thought about Newtonian attraction. Clairaut went on: 


It is true that on adding some other term one feels that the theory will better accord with the 
phenomena. But it seems to me that this term must be such that at the distances of Mercury, 
Venus, the Earth and Mars it must be almost insensible, in view of the extreme smallness of 
the motion of the apsides. And if, as it seems initially from your work, the law of squares is 
palpably in error at the distance of Saturn and Jupiter it would still be necessary to add terms 
which were significant only at that distance. I confess that the whole of gravitation seems to 
me to be only a speculative hypothesis. 


He then remarked that 


It seems to me, and I am not a candidate for the prize, much more important to know if 
Newtonian attraction holds or not than to treat simply of Saturn. And in seeing if the square 
law of attraction must suffer some correction which can only be for small distances it seems 
to me to be necessary to begin by finishing the theory of the moon. 


However, Clairaut soon withdrew the suggestion that the modifying term should 
be an inverse-fourth power, because it predicted that objects near the surface of the 
Earth should be heavier than they are. He also rejected Euler’s vortices, which he 
thought Euler himself had shown to be no help at all.⁸ 

In Clairaut’s view, part of the problem was that Newton’s Principia was difficult 
to understand.⁹ He praised it, which was still a controversial thing to do in France, 
by saying 

The famous book The Mathematical Principles of Natural Philosophy has been the occasion 
of a great revolution in Physics. The method which Mr Newton, its illustrious author, has 
followed to derive facts from their causes, has shed the light of mathematics on a science 
which up till then had been in the shadows of conjectures and hypotheses. 


and then he turned to say what had to be done next. The problem was not that Newton 
concealed his fluxional calculus; that was easy to supply. Rather, 


is it not right to reproach him for another wrong which without doubt has struck all those 
who have studied his book with a true desire to understand it? Namely, that in most of the 
difficult places he employed too few words to explain his principles [...]. 


That said, Clairaut reflected, so much else was right: “Kepler’s laws [...], the 
movement of the nodes of the moon [...], the tides, [...] and finally several other questions 
equally favourable to attraction [that] it appeared to me as difficult to reject as to 
accept”. 

Clairaut began to work intensively on the law of gravitational attraction, and on 
17 May 1749 he announced his surprising conclusion that, by taking a new point of 
view, he had found that the problem disappeared, and the inverse-square law could 
give the correct prediction for the apse line of the Moon. 

Euler was not immediately convinced, and in 1749 he persuaded the St Petersburg 
Academy to have a prize competition, and suggested several propitious topics, all 


8 See Euler Opera Omnia (4) 5, letter 421 and F&G 14.B2(b). 
9See Clairaut [45] and the extract in F&G 14.B3. 

of an astronomical nature. They chose this one: “To demonstrate whether all the 
inequalities observed in lunar motion are in accordance with Newtonian theory, 
and if they are not, to demonstrate the true theory behind all these inequalities, such 
that the exact position of the Moon at any time can be computed by means of it”, 
and Euler became one of the judges, indeed the decisive one.¹⁰ 

Clairaut hesitated over whether to enter the competition. He published his paper, 
which d’Alembert criticised, and only submitted his entry in December 1750. The 
Academy had by then decided to extend the competition to June 1, 1751, but when 
they sent Euler Clairaut’s entry he replied that it “is superb, and it is hardly likely 
that anything better will be received prior to June 1”. He repeated his endorsement, 
and admitted that he had changed his own opinion, in the official statement he wrote 
on 5 June, and the result was announced on 6 September 1751.¹¹ 

In his paper of 1749, and again in 1753, Clairaut argued that the error lay in the 
poor way in which exact, unsolvable equations for the motion of the Moon had been 
reduced to inexact, approximate, but solvable equations. 

Clairaut formulated the problem of the motion of the Moon in terms of differential 
equations, and after integrating twice found this expression for the solution, in which 
$\Omega$ is an unknown function of r, the radial distance of the Moon, and of the perturbative 
force of the Sun: 

\[ \frac{f^2}{Mr} = 1 - g\sin v - q\cos v + \sin v\int \Omega\cos v\,dv - \cos v\int \Omega\sin v\,dv. \tag{A.1} \]

Here, f, g, and q are constants of integration, M is the sum of the masses of the 
Earth and the Moon, and v is an astronomical quantity called the true anomaly (the 
angle that locates the Moon in its orbit). 

To find $\Omega$, Clairaut employed a process of successive approximations. The apse of 
the Moon was understood to move rather as if the Moon precesses on an ellipse. So 
Clairaut, following Newton, first wrote the equation of an ellipse in polar coordinates 


\[ \frac{k}{r} = 1 - e\cos mv. \tag{A.2} \]


Here, k, e, and m are constants that are either to be determined from the constants 
f, g, and q or otherwise found from observation. In particular, e was already known 
empirically to be about 0.05. This means that as cos mv varies between at most +1 
and −1, r varies between $k/(1 + e)$ and $k/(1 - e)$. 

Clairaut substituted this approximation into his original equation, and obtained 
this better approximation to r: 


\[ \frac{k}{r} = 1 - e\cos mv + \beta\cos\frac{2v}{n} + \gamma\cos\left(\frac{2}{n} - m\right)v + \delta\cos\left(\frac{2}{n} + m\right)v. \tag{A.3} \]


Here n is another quantity determined by observations (and therefore known) and 
$\beta$, $\gamma$, and $\delta$ are constants determined from the other constants so far; the new terms 
describe the way the ellipse slowly changes. 

10See Kopelevich [163]. 

11Clairaut then published his own theory of the motion of the Moon in 1753. D’Alembert now also 
withdrew his criticisms of the inverse-square law. 

Clairaut now explained his original mistake, which had led him to deny the 
inverse-square law. He evaluated $\beta$, $\gamma$, and $\delta$ and found that to nine decimal places 

\[ \beta = 0.007090988, \quad \gamma = -0.00949705, \quad \text{and} \quad \delta = 0.00018361. \]


These numbers are much smaller than e, and he felt that they were already too small 
to allow his method to double the value of m, so he did not seek a better, second 
approximation. Accordingly, he had been inclined to believe that the error must lie 
in the inverse-square law, which consequently needed amending. But in the spring 
of 1749 he calculated the next approximation and found that his hunch had been 
wrong. It turned out that the contributions coming from the $\gamma$ term were not only 
quite large, they were proportional to the transverse perturbing force, whereas the 
initial contribution to m related only to the radial perturbing force. It was only by 
going to the second approximation that Clairaut could pick up the effect that was 
making the Moon’s ellipse precess. Now, on calculating these numbers, Clairaut 
found that the monthly apsidal motion was 3°2’6”, which was just 2’ less than the 
empirical value that he accepted. 

It must be said that even Euler found Clairaut’s method difficult to follow in 
detail.¹² But in the end Newtonianism became accepted very much as Newton had 
presented it, in that 


1. it was a highly mathematical theory of the solar system; 

2. the predictions that it made rested on a highly theoretical analysis; 

3. if its conclusions were accepted then its theoretical presuppositions seemed 
inevitable, provided the mysterious force of gravity was accepted as really exist- 
ing. 

This vindication of Newton’s theoretical approach gave mathematicians the con- 
fidence to deal for the first time with many more of the most interesting aspects of 
the physical world. 


12See Euler Opera Omnia (4) V, 195-196 and F&G 14.B4. 


Appendix B 
Characteristics 


B.1 First-Order Linear Partial Differential Equations 


The simplest partial differential equation in two variables x and y that one could 
hope to solve is $u_x = 0$, which is essentially an ordinary differential equation, 
and its solution is u(x, y) = f(y), where f is any function of y. Notice, crucially, 
that any solution is constant along the curves y = const., and also that it can vary 
arbitrarily, not necessarily even continuously, from curve to curve. 

A modern approach to linear partial differential equations in two variables aims 
to reduce them to this form, and so reduce the problem to one in ordinary differential 
equations.¹³ 

Consider the equation 

\[ a u_x + b u_y = 0, \]


where a and b are constants, not both zero. One solution method notices that the 
equation says that the directional derivative of the function u vanishes in the direction 
(a,b) and so any solution u is constant along lines with equations of the form 
bx — ay = c, and so the solution to the partial differential equation is 


u(x, y) = f(bx — ay), 


which is constant along the lines bx — ay = c. 
A second solution method changes variables to 

\[ \xi = ax + by, \qquad \eta = bx - ay, \]

observes that as a result 

\[ u_x = a u_\xi + b u_\eta, \qquad u_y = b u_\xi - a u_\eta, \]

and so the partial differential equation becomes 

\[ (a^2 + b^2)u_\xi = 0. \]

This gives the solution 

\[ u(\xi, \eta) = f(\eta) = f(bx - ay), \]

as before. 

13See Grigoryan’s account: http://www.math.ucsb.edu/~grigoryan/124A.pdf or google Grigoryan 
partial differential equations. 

Once again the solution is constant along a family of curves—called the char- 
acteristics of the partial differential equation—and is therefore determined by the 
values specified on any curves transversal to the characteristics. 

The method of characteristics applies to linear first-order partial differential equa- 
tions with variable coefficients: 


\[ a(x, y)u_x + b(x, y)u_y = 0. \]
The characteristics are now curves along which the directional derivative of u van- 
ishes. 
Grigoryan gives this example: 

\[ u_x + y u_y = 0. \]


We want the directional derivative in the direction (1, y) to vanish, that is, along 
curves for which 

\[ \frac{dy}{dx} = \frac{y}{1}, \]

and these are the curves with equations $y = Ce^{x}$, where $C = ye^{-x}$ is a parameter 
that varies from curve to curve. As before, the solution to the partial differential 
equation is constant along these curves, so it is given by 

\[ u(x, y) = f(ye^{-x}). \]
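
A numerical sanity check of this example is easy to write. The Python sketch below is an added illustration: the profile `f` is an arbitrary (hypothetical) choice, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
import math

def f(s):
    """An arbitrary smooth profile, chosen only for illustration."""
    return math.cos(s) + s ** 2

def u(x, y):
    """Candidate solution u(x, y) = f(y * exp(-x))."""
    return f(y * math.exp(-x))

def pde_residual(x, y, h=1e-5):
    """Central-difference estimate of u_x + y * u_y at (x, y)."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return ux + y * uy

C = 0.7
# Sample u along the characteristic y = C * exp(x); the values should all equal f(C).
values_on_characteristic = [u(x, C * math.exp(x)) for x in (-1.0, 0.0, 0.5, 2.0)]
```

The residual is zero up to discretisation error, and the sampled values coincide, which is the statement that the solution is constant along each characteristic.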
Or we could argue that if u(x, y) = const. then we always have 

\[ u_x\,dx + u_y\,dy = 0, \]

and so 

\[ \frac{dy}{dx} = -\frac{u_x}{u_y} = y, \]

which leads to the same conclusion. 

As before, we can also solve the equation 

\[ a(x, y)u_x + b(x, y)u_y = 0, \]


by changing variables to 

\[ \xi = \xi(x, y), \qquad \eta = \eta(x, y). \]

We have 

\[ u_x = u_\xi\xi_x + u_\eta\eta_x, \qquad u_y = u_\xi\xi_y + u_\eta\eta_y, \]

and so the partial differential equation becomes 

\[ (a\xi_x + b\xi_y)u_\xi + (a\eta_x + b\eta_y)u_\eta = 0. \]

If we set $a\eta_x + b\eta_y = 0$ then the partial differential equation becomes the ordinary 
differential equation $u_\xi = 0$, which we solve. The solutions are of the form 
$u(\xi, \eta) = f(\eta)$, and once again they are constant along the curves $\eta$ = const. 

So we now have to solve the equation $a\eta_x + b\eta_y = 0$, and (mimicking an earlier 
argument) we deduce from u(x, y) = const. and $u_x\,dx + u_y\,dy = 0$ that 

\[ -\frac{u_x}{u_y} = \frac{dy}{dx} = \frac{b}{a}, \]

as before. 
Indeed, the solution to the ordinary differential equation 


\[ \frac{dy}{dx} = \frac{b(x, y)}{a(x, y)} \]


is given by an equation of the form f(x, y, c) = 0. This can be written locally, if 
we allow ourselves to share the confidence of the eighteenth century authors, as 


\[ y = f(x) + c. \]

Define $\eta = y - f(x)$ and $\xi = x$, so 

\[ u_x = u_\xi + u_\eta\left(-\frac{df}{dx}\right), \qquad u_y = u_\eta, \]

and in the new variables the partial differential equation becomes 

\[ a u_\xi + \left(b - a\frac{df}{dx}\right)u_\eta = 0, \]

or, since $df/dx = b/a$ along each characteristic, 

\[ u_\xi = 0, \]

the solutions of which are 

\[ u(\xi, \eta) = g(\eta), \]

and the value can be prescribed arbitrarily on each curve $\eta$ = constant. 

To sum up the story so far, the first step in solving a linear first-order partial 
differential equation is to change variables, so almost automatically we use the 
formulae 


\[ u_x = u_\xi\xi_x + u_\eta\eta_x, \qquad u_y = u_\xi\xi_y + u_\eta\eta_y. \]


The second step is to change variables in such a way that one term in the partial 
differential equation vanishes. This requires solving an ordinary differential equation 
for the characteristic curves. The third step observes that the solutions are constant 
along the characteristics, and so (fourth and final step) these values are picked up 
from a given set of initial conditions that specify the values of a solution along a 
curve transversal to all the characteristics. 


B.2. Burgers’ Equation, a Non-linear Equation 


Burgers’ equation, 

\[ u_t + u u_x = 0, \]

is a non-linear equation, and it displays interesting phenomena.¹⁴ The characteristics 
are straight lines of varying slopes, and the solution is constant along the 
characteristics. But now suppose, for example, that the slopes of all the lines through the x-axis 
for negative x are steep and positive, and that the slopes of all the lines through the 
x-axis for positive x are shallow and positive. Then none of the first kind of 
characteristic lines meet any of the second kind, and as t increases a gap opens up between 
the first kind and the second kind. This is called rarefaction. 

It is more interesting if, instead, the slopes of all the lines through the x-axis for 
negative x are shallow and positive, and the slopes of all the lines through the 
x-axis for positive x are steep and positive. Then all of the first kind of characteristic 
lines meet characteristic lines of the second kind, and as t increases it becomes 
impossible to say what happens when the characteristics cross (because the solution 
cannot have two different values). This is the phenomenon of a shockwave. 

We are given the quasi-linear equation 


\[ u_t + a(u)u_x = 0 \]
with initial values on the line t = 0. 


The characteristics in this case are the solutions of the ordinary differential 
equation 

\[ \frac{dx}{dt} = a(u). \]

The theory of characteristics says that every solution u remains constant on each 
characteristic, and so the slope of each characteristic is constant and the characteristics 
are straight lines. 

14See, for example, Grigoryan’s account. 

There is a characteristic for each point $(x_i, 0)$ on the line t = 0 and its slope is 
given by the value of u at that point. So if the values of u at $(x_1, 0)$ and $(x_2, 0)$, with 
$x_1 < x_2$, are $u_1$ and $u_2$ and $0 < a(u_2) < a(u_1)$ then the characteristic through $(x_1, 0)$ 
has a lesser slope (measured as $dt/dx = 1/a(u)$ in the (x, t)-plane) than the characteristic 
through $(x_2, 0)$ and will eventually cross it. 
This shows that the solution cannot be continued beyond that point. 
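
The crossing point of two such characteristics follows from elementary algebra, since each is a straight line x = x_i + a(u_i) t. The short Python sketch below is an added illustration; the function name `crossing_time` and the sample numbers are assumptions, not taken from the text.

```python
def crossing_time(x1, x2, speed1, speed2):
    """Time at which the lines x = x1 + speed1*t and x = x2 + speed2*t meet.

    Here speed_i stands for a(u_i). Returns None when the left-hand
    characteristic is not the faster one, so that the lines never meet for t > 0.
    """
    if x1 >= x2:
        raise ValueError("expected x1 < x2")
    if speed1 <= speed2:
        return None
    return (x2 - x1) / (speed1 - speed2)

# For Burgers' equation a(u) = u, so the speeds are the initial values themselves.
t_cross = crossing_time(0.0, 1.0, 2.0, 0.5)
x_cross = 0.0 + 2.0 * t_cross
```

With the sample values the two lines meet at t = 2/3; swapping the two speeds gives `None`, the rarefaction situation in which no crossing occurs.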

Courant and Hilbert (Vol. 2, Appendix 2 to Chap. II) also pointed out that the partial 
differential equation with initial values given by $u(x, 0) = \varphi(x)$ has a solution in 
the form 

\[ u - \varphi(x - t a(u)) = 0, \]

and the implicit function theorem implies that u is a differentiable function of x 
and t as long as the u derivative of $u - \varphi(x - t a(u))$ does not vanish, a condition that 
holds whenever $1 + t a'(u)\varphi'(x - t a(u)) \neq 0$. Whenever this condition is violated one 
can expect u to become singular. 

Another good discussion of Burgers’ equation 

\[ u_t + u u_x = 0 \]

is given in Evans ([100], 140-144). He takes initial data on the axis t = 0 that is 
given by a function u = g(x) that is defined as 

\[ g(x) = \begin{cases} 1 & \text{if } x \le 0 \\ 1 - x & \text{if } 0 \le x \le 1 \\ 0 & \text{if } x \ge 1. \end{cases} \]


The characteristic through $x_0$ is $x(t) = g(x_0)t + x_0$, $t \ge 0$, and along it any smooth 
solution takes the constant value $z_0 = g(x_0)$. It is instructive to plot some of these 
before proceeding. Accordingly, the solution function is 

\[ u(x, t) = \begin{cases} 1 & \text{if } x \le t, \ 0 \le t < 1 \\ \dfrac{1 - x}{1 - t} & \text{if } t \le x \le 1, \ 0 \le t < 1 \\ 0 & \text{if } x \ge 1, \ 0 \le t < 1. \end{cases} \]

The method breaks down when t > 1 (note that u is apparently infinite when t = 1), 
and in this case the characteristics cross. The most visible case is the way the 
characteristics through points $x_0 < 0$ meet the ones through points $x_0 > 1$. 
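
The claim that the solution is constant along each characteristic can be checked directly from the two piecewise formulas. The Python sketch below is an added illustration; the helper name `max_deviation` and the sample points are assumptions chosen for the example.

```python
def g(x):
    """Initial data of the example."""
    if x <= 0.0:
        return 1.0
    if x <= 1.0:
        return 1.0 - x
    return 0.0

def u(x, t):
    """Solution for 0 <= t < 1, following the piecewise formula."""
    if not 0.0 <= t < 1.0:
        raise ValueError("formula is only valid for 0 <= t < 1")
    if x <= t:
        return 1.0
    if x <= 1.0:
        return (1.0 - x) / (1.0 - t)
    return 0.0

def max_deviation(starts, times):
    """Largest |u(x(t), t) - g(x0)| along the characteristics x(t) = g(x0)*t + x0."""
    worst = 0.0
    for x0 in starts:
        for t in times:
            x = g(x0) * t + x0
            worst = max(worst, abs(u(x, t) - g(x0)))
    return worst

deviation = max_deviation([-1.5, -0.2, 0.25, 0.5, 0.9, 1.3], [0.0, 0.3, 0.6, 0.95])
```

The deviation is zero up to rounding for every t below 1. Characteristics starting in the interval from 0 to 1 all pass through the point x = 1 at t = 1, which is where the formula ceases to make sense.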


Appendix C 


The First-Order Non-linear Partial Differential 
Equation 


For reference, here is a statement of the existence and uniqueness theorem for the 
first-order partial differential equation 


F(x, y,z, p,q) =0 


(from John ([153], 29)). 

The function F has continuous second derivatives; 

Along an initial curve $(x_0(s), y_0(s))$ initial values $z_0(s)$ are assigned, and $x_0, y_0, z_0$ 
have continuous second derivatives; 

There are two continuously differentiable functions $p_0(s)$ and $q_0(s)$ such that 

\[ F(x_0(s), y_0(s), z_0(s), p_0(s), q_0(s)) = 0 \]

and 

\[ \frac{dz_0}{ds} = p_0\frac{dx_0}{ds} + q_0\frac{dy_0}{ds}; \]

The transversality condition: 

\[ \frac{dx_0}{ds}F_q(x_0, y_0, z_0, p_0, q_0) - \frac{dy_0}{ds}F_p(x_0, y_0, z_0, p_0, q_0) \neq 0. \]

Then in some neighbourhood of the initial curve there exists a unique solution that 
contains the initial strip, i.e. 

\[ z(x_0(s), y_0(s)) = z_0(s), \quad z_x(x_0(s), y_0(s)) = p_0(s), \quad z_y(x_0(s), y_0(s)) = q_0(s). \]


We shall now see that the equation has a unique solution. 
Solutions of the general first-order partial differential equation 

\[ F(x, y, z, p, q) = 0, \]

with whatever conditions on F that seem necessary will form a two-parameter family 
of surfaces together with whatever envelopes one-parameter families may form and 
any envelopes the two-parameter family has (there may be none). 

We now make a standard move in the subject and attempt to accumulate so many 
necessary conditions on any candidate for a solution to the partial differential equation 
that together they form a set of sufficient conditions that enable the problem to be 
solved. 

Fix attention on one of these surfaces and suppose it passes through the point 
$P_0 = (x_0, y_0, z_0)$. The equation 

\[ F(x_0, y_0, z_0, p, q) = 0 \tag{C.1} \]

is an equation relating p and q at $P_0$, and we can think of it as defining q as a function 
of p: $q = q(x_0, y_0, z_0, p)$. So we have a one-parameter family of planes through $P_0$ 
with equations: 

\[ z - z_0 = (x - x_0)p + (y - y_0)q. \tag{C.2} \]


These planes envelope a cone through $P_0$ that is tangent to the surface. To find this 
envelope, we differentiate the above equation with respect to p and obtain 

\[ 0 = (x - x_0) + (y - y_0)\frac{dq}{dp}. \tag{C.3} \]

From the first equation we obtain 

\[ F_p + F_q\frac{dq}{dp} = 0, \tag{C.4} \]

and so we may write 

\[ \frac{x - x_0}{F_p} = \frac{y - y_0}{F_q}, \tag{C.5} \]

and therefore 

\[ \frac{x - x_0}{F_p} = \frac{y - y_0}{F_q} = \frac{z - z_0}{pF_p + qF_q}. \tag{C.6} \]

From Eqs. (C.2) and (C.5) we deduce that these values of p and q define a tangent 
line in the cone that is tangent to the surface and is given by Eq. (C.6). 

The tangent line is a tangent to the characteristic curve in the surface through $P_0$, 
which therefore satisfies the equations 

\[ \frac{dx}{F_p} = \frac{dy}{F_q} = \frac{dz}{pF_p + qF_q}, \tag{C.7} \]

which we can regard as 

\[ \frac{dx}{dt} = F_p, \qquad \frac{dy}{dt} = F_q, \qquad \frac{dz}{dt} = pF_p + qF_q. \tag{C.8} \]
This is not enough information to determine the surface. But we also know that 
any characteristic curve in the surface (regarded as a curve parameterised by t) must 
satisfy the equations 

\[ \frac{dp}{dt} = p_x\frac{dx}{dt} + p_y\frac{dy}{dt} = p_xF_p + p_yF_q \]

and 

\[ \frac{dq}{dt} = q_x\frac{dx}{dt} + q_y\frac{dy}{dt} = q_xF_p + q_yF_q. \]

We can also differentiate the equation F(x, y, z, p, q) = 0 with respect to x and 
with respect to y and obtain 

\[ F_x + F_zp + F_pp_x + F_qq_x = 0 \quad \text{and} \quad F_y + F_zq + F_pp_y + F_qq_y = 0. \]


Therefore, since $p_y = q_x$, 

\[ \frac{dp}{dt} = -F_x - F_zp, \qquad \frac{dq}{dt} = -F_y - F_zq. \]

Now we have five equations for the functions that define the characteristic curves 
and fill out the solution surface, x(t), y(t), z(t), p(t), q(t). This is too many, but 
the condition F(x, y, z, p, q) = 0 implies that $dF/dt = 0$, so it merely says that F is 
constant along any of these curves, which is as it should be. 
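
The five characteristic equations can be integrated numerically for a concrete equation. The Python sketch below is an added illustration only: the choice F = pq − z, the starting values, and the hand-written Runge-Kutta step are all assumptions made for the example, not part of the theorem.

```python
import math

def F(state):
    """The illustrative equation F(x, y, z, p, q) = p*q - z."""
    x, y, z, p, q = state
    return p * q - z

def rhs(state):
    """Right-hand sides of the five characteristic equations for F = p*q - z."""
    x, y, z, p, q = state
    Fx, Fy, Fz, Fp, Fq = 0.0, 0.0, -1.0, q, p
    return (Fp, Fq, p * Fp + q * Fq, -Fx - Fz * p, -Fy - Fz * q)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end=1.0, steps=1000):
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = (0.0, 1.0, 1.0, 0.5, 2.0)   # (x0, y0, z0, p0, q0), chosen so that z0 = p0*q0
end = integrate(start)
```

Since `F(start)` is zero, `F(end)` remains zero up to integration error, in line with the remark that F is constant along each characteristic. For this particular F the equations for p and q reduce to dp/dt = p and dq/dt = q, so `end[3]` should be close to 0.5e.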

It remains to check that these characteristic curves define a surface that is a solution 
to the partial differential equation, and that a surface can be found passing through any 
initial curve $(x_0(s), y_0(s), z_0(s))$, 0 < s < 1, that is not a characteristic curve. The 
five equations for $\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}, \frac{dp}{dt}, \frac{dq}{dt}$ are all ordinary differential equations. We 
suppose that when t = 0, $x = x_0(s)$, $y = y_0(s)$, $z = z_0(s)$, $p = p_0(s)$, $q = q_0(s)$; 
the first three of these equations place $(x_0, y_0, z_0)$ on the initial curve. We appeal to 
the existence of solutions to an ordinary differential equation to deduce that in the 
neighbourhood of the initial curve there are functions such that 

\[ x = X(s, t), \quad y = Y(s, t), \quad z = Z(s, t), \quad p = P(s, t), \quad q = Q(s, t) \]

and 

\[ X(s, 0) = x_0(s), \quad Y(s, 0) = y_0(s), \quad Z(s, 0) = z_0(s), \quad P(s, 0) = p_0(s), \quad Q(s, 0) = q_0(s). \]


It remains to check that these curves define a surface z = z(x, y) and that this surface 
is a solution of the partial differential equation. To eliminate s and t so that we can 
write 

\[ z = Z(s, t) = Z(s(x, y), t(x, y)) = z(x, y) \]

(if you forgive the ambiguous notation for a moment), requires that the Jacobian 
$\frac{\partial(x, y)}{\partial(s, t)}$ be invertible when t = 0, which it is because its determinant 
$\frac{dx_0}{ds}F_q - \frac{dy_0}{ds}F_p$ is non-zero. 


Finally, to see that z = z(x, y) is a solution of the partial differential equation we 
check that 

\[ F(x, y, z(x, y), z_x(x, y), z_y(x, y)) = 0, \]

which involves checking only that $p = z_x$ and $q = z_y$. 
However, the argument to establish this conclusion is somewhat roundabout. One 
defines 

\[ U = z_s - px_s - qy_s, \]

\[ V = z_t - px_t - qy_t. \]

If we see these equations as equations for p and q, we can deduce that 

\[ \begin{pmatrix} x_s & y_s \\ x_t & y_t \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} z_s - U \\ z_t - V \end{pmatrix}. \]

We can solve these equations provided that $x_sy_t - x_ty_s \neq 0$, which we have already 
observed above. 
Then we treat the equations 

\[ 0 = z_s - z_xx_s - z_yy_s, \]

\[ 0 = z_t - z_xx_t - z_yy_t \]

in the same way, and deduce that 

\[ \begin{pmatrix} x_s & y_s \\ x_t & y_t \end{pmatrix}\begin{pmatrix} z_x \\ z_y \end{pmatrix} = \begin{pmatrix} z_s \\ z_t \end{pmatrix}. \]

Now we can see that our remaining problem is solved if U = 0 and V = 0. Happily 
for us, V = 0 is a consequence of the characteristic equations, in which t is the 
parameter along each characteristic. The requirement on 
U is more difficult. We regard U and V both as functions of s and t. Then we 
calculate $\frac{\partial U}{\partial t} - \frac{\partial V}{\partial s}$ and use the fact that V = 0 to deduce that $\frac{\partial V}{\partial s} = 0$ to obtain 
an expression for $\frac{\partial U}{\partial t}$. Then, from the equation F = 0 we deduce that 

\[ \frac{\partial U}{\partial t} = -F_zU. \]

We treat this as an ordinary differential equation for U as a function of t for each 
fixed s and solve it to obtain 

\[ U(t) = U(0)e^{-\int_0^t F_z\,dt}. \]

But U(0) = 0, because the initial data satisfy $\frac{dz_0}{ds} = p_0\frac{dx_0}{ds} + q_0\frac{dy_0}{ds}$, and so 
U(t) = 0 for all t. This concludes the proof. 


Note also that nothing more has been required of these functions than that they are 
at least two times continuously differentiable with respect to the obvious variables. 


Appendix D 
Green’s Theorem and Heat Conduction 


Green’s theorem proved endlessly useful in the study of the partial differential equa- 
tions of mathematical physics, as the following examples illustrate. 


D.1_ Explicit Representations 


To proceed further, we come down to two dimensions. The same process applies as 
it did in three dimensions, except that v is to be infinite like $\frac{1}{2\pi}\ln r$ at a point of D. 
If the Dirichlet problem is posed for a two-dimensional region D with boundary C 
then the relation between the Dirichlet problem and Green’s problem is given by 

\[ u(P) = \int_C f\,(n\cdot\nabla v) \]

for a suitable function v. 

The Dirichlet problem for the unit disc asks for a harmonic function u on the disc 
that agrees with a given function f on the unit circle. If we take the approach of 
finding a Green’s function then we have to find a function v(x, y). 

It helps to be clear about domains and coordinates. We define v on D × D and 
write $v(x, y, \xi, \eta)$. We then define 

\[ v(x, y, \xi, \eta) = \frac{1}{2\pi}\ln\left(\left((x - \xi)^2 + (y - \eta)^2\right)^{1/2}\right) + h(x, y, \xi, \eta), \]

for some suitable function $h(x, y, \xi, \eta)$ that we have to find, where we require that 

\[ \nabla^2 h = 0 \text{ on } D, \]

and, for v to vanish when $(\xi, \eta) \in C$, that 

\[ h(x, y, \xi, \eta) = -\frac{1}{2\pi}\ln\left(\left((x - \xi)^2 + (y - \eta)^2\right)^{1/2}\right), \quad (\xi, \eta) \in C. \]


So we have set $v(x, y) = \frac{1}{2\pi}\ln r + h(x, y)$, where h is to be harmonic everywhere 
in the disc and has the prescribed behaviour on the boundary that $h(x, y, \xi, \eta) = 
-\frac{1}{2\pi}\ln r$, where r is the distance from (x, y) to $(\xi, \eta)$, when $(\xi, \eta) \in C$. 

If we let D be the upper half-plane, so the boundary C is the real axis, then we 
define $h(x, y, \xi, \eta)$ by looking at the mirror image of the point (x, y) in the real axis, 
which is the point (x, −y). We now define 

\[ h(x, y, \xi, \eta) = -\frac{1}{2\pi}\ln r', \]

where r′ is the distance from (x, −y) to $(\xi, \eta)$. This function fails to be harmonic 
only at (x, −y) which is not in D, and it is equal to $-\frac{1}{2\pi}\ln r$ on the boundary, as 
required. 

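Now let D be the unit disc and take $v = \frac{1}{2\pi}\ln r$. The unit normal vector at a point 
(x, y) on the unit circle is (x, y). The above argument says we have 

\[ u(0) = \int_{|z|=1} f\,n\cdot\nabla v. \]

Here 

\[ \nabla v = \frac{1}{2\pi}(\partial_x\log r, \partial_y\log r). \]

We find 

\[ \partial_x\log r = \frac{\partial_x r}{r} = \frac{\partial_x(x^2 + y^2)^{1/2}}{r} = \frac{x}{r^2}, \]

and similarly $\partial_y\log r = \frac{y}{r^2}$, so 

\[ n\cdot\nabla v = \frac{1}{2\pi}(x, y)\cdot\left(\frac{x}{r^2}, \frac{y}{r^2}\right) = \frac{1}{2\pi} \quad \text{on } r = 1, \]

and therefore 

\[ u(0) = \frac{1}{2\pi}\int_{|z|=1} f. \]

This expresses the averaging property of a harmonic function on a disc: its value at the 
centre is the average of its values on the boundary circle. 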

What happens if we move the point where Green’s function becomes infinite to 
somewhere off-centre but still in the disc, say z = a, |a| < 1? There is a Möbius 
transformation that maps the unit circle to itself and the point a to the origin, 

\[ z \mapsto \frac{\alpha z - b}{-\bar{b}z + \bar{\alpha}}, \]

where $a = b/\alpha$. The corresponding Green’s function is 

\[ \frac{1}{2\pi}\ln\left|\frac{\alpha z - b}{-\bar{b}z + \bar{\alpha}}\right|. \]

Fig. D.1 Finding Green’s function at an off-centre point 

There is a classical argument called the method of images that finds the harmonic 
function explicitly from its boundary values when D is the unit disc. We move the 
infinite point to P = (x, y) or $(\rho, \theta)$ in polar coordinates, define $P' = (\rho', \theta)$ where 
$\rho\rho' = 1$; the point P′ is called the image of P because it is obtained by inverting P 
in the unit circle (Fig. D.1). 

Now we look at the point Q with (Cartesian) coordinates $(\xi, \eta)$ and polar 
coordinates $(\varrho, \phi)$. We let O denote the origin, and let r denote the distance PQ and r′ 
denote the distance P′Q. Then by applying the cosine rule first to triangle OPQ and 
then to triangle OP′Q we find that 

\[ r^2 = \rho^2 + \varrho^2 - 2\rho\varrho\cos(\theta - \phi), \]

\[ r'^2 = \rho'^2 + \varrho^2 - 2\rho'\varrho\cos(\theta - \phi). \]

When Q is on the boundary of the circle, so $|OQ| = \varrho = 1$, we find that 

\[ \frac{r^2}{r'^2} = \frac{1 + \rho^2 - 2\rho\cos(\theta - \phi)}{1 + 1/\rho^2 - (2/\rho)\cos(\theta - \phi)} = \rho^2. \]

So the Green’s function for the Laplacian is now 

\[ v(x, y) = \frac{1}{2\pi}\ln\frac{r}{r'\rho} = \frac{1}{4\pi}\ln\frac{\rho^2 + \varrho^2 - 2\rho\varrho\cos(\theta - \phi)}{\rho^2\varrho^2 + 1 - 2\rho\varrho\cos(\theta - \phi)}. \]

In this case, 

\[ n\cdot\nabla v = \frac{\partial v}{\partial\varrho} \]

evaluated at $\varrho = 1$, which is 

\[ \frac{1}{2\pi}\,\frac{1 - \rho^2}{1 + \rho^2 - 2\rho\cos(\theta - \phi)}. \]

This gives the Poisson integral formula for the disc: 

\[ u(\rho, \theta) = \frac{1}{2\pi}\int_0^{2\pi}\frac{(1 - \rho^2)f(\phi)\,d\phi}{1 + \rho^2 - 2\rho\cos(\theta - \phi)}. \]


Note that if p = 0 this collapses to the averaging result that we had before, as it 
should. 
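
The Poisson integral formula can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the boundary values $\cos 2\phi$ (those of the harmonic function $x^2 - y^2$), approximates the integral by an equally spaced sum, and compares the result with the known interior value $\rho^2 \cos 2\theta$. The function name `poisson_disc` and the sample point are illustrative choices.

```python
import math

def poisson_disc(boundary, rho, theta, n=2000):
    """Approximate the Poisson integral for the unit disc at (rho, theta).

    `boundary` gives the values of u on the unit circle as a function of phi.
    The integral over [0, 2*pi] is replaced by an equally spaced sum.
    """
    total = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        kernel = (1.0 - rho ** 2) / (1.0 + rho ** 2 - 2.0 * rho * math.cos(theta - phi))
        total += kernel * boundary(phi)
    # (1 / 2 pi) * (2 pi / n) * sum = sum / n
    return total / n

# Boundary values of the harmonic function x^2 - y^2 = r^2 cos(2 phi).
boundary = lambda phi: math.cos(2.0 * phi)

rho, theta = 0.5, 0.7
approx = poisson_disc(boundary, rho, theta)
exact = rho ** 2 * math.cos(2.0 * theta)

# At the centre the formula reduces to the plain average of the boundary values.
centre = poisson_disc(boundary, 0.0, 0.0)
```

Here `approx` and `exact` should agree closely, and `centre` should be the average of $\cos 2\phi$ over the circle, which is zero.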


D.1.1  Adjoint Equations 


A later generalisation of Green’s argument became known as the method of adjoint 
equations. The idea of introducing the adjoint equation of a given ordinary or partial 
differential equation is to make the original equation easier to solve, and as was 
briefly mentioned in Sect. 1.5 it was introduced by Lagrange in the context of ordinary 
differential equations. 

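We write, following Sommerfeld ([248], Sect. 10),
\[ L(u) = A\frac{\partial^2 u}{\partial x^2} + 2B\frac{\partial^2 u}{\partial x \partial y} + C\frac{\partial^2 u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu = 0, \]
where $A, B, C, D, E, F$ are sufficiently differentiable functions of $x$ and $y$.
The adjoint equation to $L(u) = 0$ will be another second-order partial differential
equation, $M(v) = 0$, such that
\[ vL(u) - uM(v) = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} \]
for two functions $X(x, y)$ and $Y(x, y)$ that have also to be found.
It turns out that
\[ M(v) = \frac{\partial^2 (Av)}{\partial x^2} + 2\frac{\partial^2 (Bv)}{\partial x \partial y} + \frac{\partial^2 (Cv)}{\partial y^2} - \frac{\partial (Dv)}{\partial x} - \frac{\partial (Ev)}{\partial y} + Fv \]
and
\[ X = A\left( v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x} \right) + B\left( v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y} \right) + \left( D - \frac{\partial A}{\partial x} - \frac{\partial B}{\partial y} \right) uv, \]
\[ Y = B\left( v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x} \right) + C\left( v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y} \right) + \left( E - \frac{\partial B}{\partial x} - \frac{\partial C}{\partial y} \right) uv. \]
In the last two equations, we can replace $(x, A, D)$ by $(y, C, E)$.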
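
The defining identity $vL(u) - uM(v) = \partial X/\partial x + \partial Y/\partial y$ can be tested numerically. The sketch below is only an illustration under simplifying assumptions: the coefficients are taken to be constants (so the derivative terms of $A$, $B$, $C$ in $X$ and $Y$ drop out), the test functions `u` and `v` are arbitrary smooth choices, and all derivatives are approximated by central differences with step `H`.

```python
import math

# Illustrative constant coefficients (an assumption; the text allows functions of x and y).
A, B, C, D, E, F = 2.0, 0.5, 1.5, 0.3, -0.7, 1.1
H = 1e-3  # finite-difference step

def u(x, y):
    return math.sin(x) * math.exp(0.5 * y)

def v(x, y):
    return math.cos(x + 2.0 * y)

def dx(f):
    """Central-difference approximation to the partial derivative in x."""
    return lambda x, y: (f(x + H, y) - f(x - H, y)) / (2.0 * H)

def dy(f):
    """Central-difference approximation to the partial derivative in y."""
    return lambda x, y: (f(x, y + H) - f(x, y - H)) / (2.0 * H)

def L(f):
    return lambda x, y: (A * dx(dx(f))(x, y) + 2.0 * B * dx(dy(f))(x, y)
                         + C * dy(dy(f))(x, y) + D * dx(f)(x, y)
                         + E * dy(f)(x, y) + F * f(x, y))

def M(f):
    # With constant coefficients the adjoint only flips the signs of D and E.
    return lambda x, y: (A * dx(dx(f))(x, y) + 2.0 * B * dx(dy(f))(x, y)
                         + C * dy(dy(f))(x, y) - D * dx(f)(x, y)
                         - E * dy(f)(x, y) + F * f(x, y))

def X(x, y):
    return (A * (v(x, y) * dx(u)(x, y) - u(x, y) * dx(v)(x, y))
            + B * (v(x, y) * dy(u)(x, y) - u(x, y) * dy(v)(x, y))
            + D * u(x, y) * v(x, y))

def Y(x, y):
    return (B * (v(x, y) * dx(u)(x, y) - u(x, y) * dx(v)(x, y))
            + C * (v(x, y) * dy(u)(x, y) - u(x, y) * dy(v)(x, y))
            + E * u(x, y) * v(x, y))

x0, y0 = 0.4, -0.3
lhs = v(x0, y0) * L(u)(x0, y0) - u(x0, y0) * M(v)(x0, y0)
rhs = dx(X)(x0, y0) + dy(Y)(x0, y0)
residual = abs(lhs - rhs)  # small, limited only by the finite-difference error
```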


You can check that $L(v)$ is the adjoint of $M(u)$, and that $L(u) = M(u)$ (when
the equation $L(u) = 0$ is said to be self-adjoint) if and only if
\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = D, \qquad \frac{\partial B}{\partial x} + \frac{\partial C}{\partial y} = E. \]
In particular, if the coefficients are constant and $D = E = 0$ the equation is
self-adjoint.

It is true, but will not be proved here, that if a second-order linear partial differential
equation arises as an Euler-Lagrange equation then it is self-adjoint. It is also true
that a second-order linear partial differential equation with constant coefficients can
be made self-adjoint by multiplying it by a factor of the form $\exp(\lambda x + \mu y)$ unless
it is the heat equation.

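Let us now integrate $\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}$ over a region $S$ with area element $d\sigma$ bounded by
a simple closed curve $C$ with line element $ds$. The theorem of Gauss and Green says
\[ \int_S (vL(u) - uM(v)) \, d\sigma = \int_S \left( \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} \right) d\sigma = \int_C (X, Y) \cdot n \, ds. \]
In the elliptic case in normal form, where $A = C = 1$, $B = 0$, the RHS becomes
\[ \int_C \left( v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n} + uv \, (D, E) \cdot n \right) ds. \]
This generalises the usual Green’s theorem of potential theory (the case where also
$D = E = 0$), which says that
\[ \int_S (v\Delta u - u\Delta v) \, d\sigma = \int_S \left( \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} \right) d\sigma, \]
where
\[ \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}. \]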


The importance of the adjoint equation arises from the fact that if $u$ and $v$ are such
that
\[ L(u) = 0, \quad M(v) = 0 \]
then the LHS’s above vanish, and the corresponding equations become

374 Appendix D: Green’s Theorem and Heat Conduction 


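\[ \int_C (X, Y) \cdot n \, ds = 0, \]
\[ \int_S \left( \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} \right) d\sigma = \int_C (X, Y) \cdot n \, ds = 0. \]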


This holds provided that $u$, $v$, and their derivatives are continuous throughout the
region $S$. If $v$, say, is discontinuous at a point $Q = (\xi, \eta)$ in $S$ then it is excluded by
drawing an arbitrarily small contour $K$ around it, and taking the integral over both $C$
in the positive direction and $K$ in the negative direction.

The most important single case is where the discontinuity of $v$ at $Q$ represents
a point source of unit strength, one for which the yield $q$ is the gradient of $v$. This
means that at a radial distance $\rho$ from $Q$
\[ q = \oint \frac{\partial v}{\partial \rho} \, ds = 1. \]
If, very close to $Q$, $v$ depends only on $\rho$ then
\[ \oint ds = 2\pi\rho \]
and
\[ 2\pi\rho \frac{dv}{d\rho} = 1. \]
So
\[ \frac{dv}{d\rho} = \frac{1}{2\pi\rho}, \qquad v = \frac{1}{2\pi}\log\rho + \text{const.} \]
as $\rho \to 0$.
So we have
\[ v = U\log\rho + V, \qquad \rho = \sqrt{(x - \xi)^2 + (y - \eta)^2}, \]
where $U$ and $V$ are analytic functions of $(x, y)$ and $(\xi, \eta)$ such that $U \to \frac{1}{2\pi}$ as
$(x, y) \to (\xi, \eta)$.

This function $v$ is called a Green’s function or a principal solution of the differential
equation $M(v) = 0$. Similarly, the function $u$ is called a Green’s function or a
principal solution of the differential equation $L(u) = 0$. The functions $U$ and $V$ will
be analytic if $D$, $E$, and $F$ are analytic.

Next, an account of how this comes about in the context of heat conduction, 
following (Sommerfeld [248], Sect. 12). 

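The partial differential equation for heat conduction is (with $y = kt$)
\[ L(u) = \frac{\partial^2 u}{\partial x^2} - \frac{\partial u}{\partial y} = 0. \]
It is not self-adjoint; the adjoint equation is
\[ M(v) = \frac{\partial^2 v}{\partial x^2} + \frac{\partial v}{\partial y} = 0. \]
Furthermore,
\[ X = v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x} \quad \text{and} \quad Y = -uv. \]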


Because $x$ is a space variable but $y$ is a time variable, we choose to consider this
only for simple regions bounded by sides parallel to the $x$ or $y$ axes. For such regions,
along a side $AB$ that is parallel to the $x$-axis
\[ ds = dx \text{ and } dn = -dy; \quad \cos(n, x) = 0, \quad \cos(n, y) = -1, \]
and so
\[ \int_A^B (X\cos(n, x) + Y\cos(n, y)) \, ds = -\int_A^B Y \, dx. \]
Similarly, along a side $CD$ parallel to the $y$-axis we obtain
\[ \int_C^D (X\cos(n, x) + Y\cos(n, y)) \, ds = \int_C^D X \, dy. \]
The general form of Green’s theorem then says
\[ \iint (vL(u) - uM(v)) \, dx \, dy = \int uv \, dx + \int \left( v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x} \right) dy. \]


We apply this to a heat conductor infinite in both directions and for which the
temperature at time $t = 0$ is given as $u = f(x)$. The Fourier integral representation
of the function $f(x)$ is
\[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(\xi) e^{i\omega(x - \xi)} \, d\xi \right) d\omega. \]
To obtain a solution to the heat equation, we multiply the exponential term by a
function $\varphi(y)$ and plug this expression into the equation. We find that we require
\[ -\omega^2 \varphi(y) = \frac{d\varphi}{dy}, \]
so
\[ \varphi(y) = Ce^{-\omega^2 y}. \]
As we require $\varphi(0) = 1$, we obtain the solution to the heat equation in the form
\[ u(x, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(\xi) e^{i\omega(x - \xi) - \omega^2 y} \, d\xi \right) d\omega. \]
Curiously, the resulting integral converges for a wider class of functions than does
the original representation of the function $f$, and moreover the order of integration
is now reversible.


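We now write $y = kt$ and obtain the solution in the form
\[ u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(\xi) e^{i\omega(x - \xi) - \omega^2 kt} \, d\xi \right) d\omega. \]
The exponent is of the form $-\alpha\omega^2 + \beta\omega$, with $\alpha = kt$ and $\beta = i(x - \xi)$, so we complete the square
\[ -\alpha\omega^2 + \beta\omega = -\alpha\left( \omega - \frac{\beta}{2\alpha} \right)^2 + \frac{\beta^2}{4\alpha}. \]
Set $p = \omega - \beta/2\alpha$, and then
\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-\omega^2 kt + i\omega(x - \xi)} \, d\omega = \frac{1}{2\pi} \exp\left( -\frac{(x - \xi)^2}{4kt} \right) \int_{-\infty}^{\infty} e^{-\alpha p^2} \, dp. \]
Write the RHS as $U$ for the moment.

There is an integral due to Laplace that was well known in the nineteenth century (and
doubtless still is) that says
\[ \int_{-\infty}^{\infty} e^{-p^2} \, dp = \sqrt{\pi}, \]
so
\[ \int_{-\infty}^{\infty} e^{-\alpha p^2} \, dp = \sqrt{\frac{\pi}{\alpha}}, \]
and
\[ U = \frac{1}{\sqrt{4\pi kt}} \exp\left( -\frac{(x - \xi)^2}{4kt} \right). \]
As $t \to 0$ we have $u(x, t) \to f(x)$, so
\[ f(x) = \int_{-\infty}^{\infty} f(\xi) U \, d\xi, \]
and
\[ \int_{x - \varepsilon}^{x + \varepsilon} U \, d\xi = 1. \]
For a source of heat concentrated at the origin we have
\[ U = \frac{1}{\sqrt{4\pi kt}} \exp\left( -\frac{x^2}{4kt} \right). \tag{D.1} \]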
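
Two properties of the principal solution $U$ can be checked numerically: that the total heat $\int U \, d\xi$ equals 1, and that $U$ satisfies $\partial u/\partial t = k \, \partial^2 u/\partial x^2$. The following is a minimal sketch under stated assumptions: the values of `k`, `t` and `x` are arbitrary, the integral is truncated to a finite interval and evaluated by the midpoint rule, and the derivatives are approximated by central differences.

```python
import math

def heat_kernel(x, xi, t, k=1.0):
    """Principal solution U for a unit heat source at xi, observed at (x, t)."""
    return math.exp(-(x - xi) ** 2 / (4.0 * k * t)) / math.sqrt(4.0 * math.pi * k * t)

def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k, t, x = 0.8, 0.05, 0.3

# Total heat: the integral of U over xi (the tails beyond +-10 are negligible here).
total_heat = midpoint(lambda xi: heat_kernel(x, xi, t, k), x - 10.0, x + 10.0)

# Residual of u_t - k u_xx for the source at the origin, by central differences.
h = 1e-4
u_t = (heat_kernel(x, 0.0, t + h, k) - heat_kernel(x, 0.0, t - h, k)) / (2.0 * h)
u_xx = (heat_kernel(x + h, 0.0, t, k) - 2.0 * heat_kernel(x, 0.0, t, k)
        + heat_kernel(x - h, 0.0, t, k)) / h ** 2
residual = abs(u_t - k * u_xx)
```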


D.1.2  Boundary Value Problems


We shall suppose that the heat conductor is infinite in both directions and can be
represented by the real line.¹⁵ We write
\[ u(x, t) = \int_{-\infty}^{\infty} f(\xi) U \, d\xi, \quad \text{where } U = \frac{1}{\sqrt{4\pi kt}} \exp\left( -\frac{(x - \xi)^2}{4kt} \right). \]
We shall consider two separate boundary conditions. The isothermal one for a
temperature distribution $u(0, t)$, where we impose $u = 0$; and the adiabatic one for a
given heat flow $G(0, t)$, where we impose $\partial u/\partial x = 0$.

Both these conditions are satisfied if the function $f$, which is given only for
$0 < x < \infty$, is extended to the negative real axis as an odd function (for the isothermal
condition) or as an even function (for the adiabatic condition), and so represented by a
pure sine or a pure cosine integral respectively. This yields
\[ u(x, t) = \int_0^{\infty} f(\xi) U(\xi) \, d\xi \mp \int_0^{\infty} f(\xi) U(-\xi) \, d\xi, \]
with the upper sign for the isothermal and the lower sign for the adiabatic condition.
The principal solution $U(\xi)$ becomes
\[ U(-\xi) = \frac{1}{\sqrt{4\pi kt}} \exp\left( -\frac{(x + \xi)^2}{4kt} \right) \]
and describes a point source of heat at $x = -\xi$, $t = 0$.

So we have
\[ u(x, t) = \int_0^{\infty} f(\xi) G(\xi) \, d\xi, \quad \text{where } G(\xi) = U(\xi) \mp U(-\xi). \]
The function $G$ is called a Green’s function. It has only one pole (or heat source) in
the interval $0 < x < \infty$, and it satisfies the adjoint equation as a function of $\xi$ and $\tau$
because it is independent of $\tau$ and so the change of sign with respect to the $\tau$ variable
is irrelevant.
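
The effect of the image source can be seen numerically. The sketch below is an illustration only: the initial temperature `f`, the time `t`, the truncation of the integral at `upper`, and the midpoint rule are all assumptions made for the example. With $G = U(\xi) - U(-\xi)$ the temperature at $x = 0$ vanishes, and with $G = U(\xi) + U(-\xi)$ the solution is even in $x$, so its slope at $x = 0$ vanishes.

```python
import math

def U(x, xi, t, k=1.0):
    """Principal solution for a unit heat source at xi."""
    return math.exp(-(x - xi) ** 2 / (4.0 * k * t)) / math.sqrt(4.0 * math.pi * k * t)

def solve_half_line(f, x, t, sign, k=1.0, upper=20.0, n=8000):
    """u(x, t) = integral over (0, upper) of f(xi) * [U(xi) + sign * U(-xi)].

    sign = -1 gives the isothermal condition u(0, t) = 0 (odd extension);
    sign = +1 gives the adiabatic condition du/dx(0, t) = 0 (even extension).
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += f(xi) * (U(x, xi, t, k) + sign * U(x, -xi, t, k))
    return total * h

f = lambda xi: math.exp(-(xi - 1.0) ** 2)  # hypothetical initial temperature on x > 0
t = 0.3

u_iso_at_0 = solve_half_line(f, 0.0, t, sign=-1)
u_iso_inside = solve_half_line(f, 1.0, t, sign=-1)

# Slope at x = 0 for the adiabatic case, using the even extension of the formula.
eps = 1e-4
flux_adiabatic = (solve_half_line(f, eps, t, sign=+1)
                  - solve_half_line(f, -eps, t, sign=+1)) / (2.0 * eps)
```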

The initial conditions are now to be represented by not a single point source but a
continuum of them: first sum over a finite number of point sources of heat and then
pass to an integral. They are placed at every $\eta < -\xi$.

The corresponding function $G$ is given by
\[ G = U(\xi) + AU(-\xi) + \int_{-\infty}^{-\xi} a(\eta) U(\eta) \, d\eta, \]
where the point source of heat contributes an amount $A$ and the continuous source
is represented by $a(\eta) \, d\eta$.
The boundary conditions can then be used to determine $A$ and $a(\eta)$.

¹⁵ See Sommerfeld ([248], 66).


Appendix E 
Complex Analysis 


It is impossible to explain Riemann’s contribution to the study of differential equa- 
tions without describing his approach to the theory of complex analysis, of which he 
is one of the three creators, alongside Cauchy and Weierstrass. But obviously there 
is no room here for a proper historical account of the emergence of key ideas on 
complex function theory. This chapter is therefore a series of glimpses into some of 
these ideas. 


E.1 Harmonic Functions 


We have seen that it is an elementary consequence of the Cauchy–Riemann equations
that the real and imaginary parts of a complex analytic function are harmonic
functions. We shall now see that given the real part (say) of a complex function
on a simply connected domain the imaginary part is determined up to a constant
(we assume the necessary partial derivatives exist and are continuous). Suppose that
we are given $u(x, y)$ and required to find $v(x, y)$ such that $u(x, y) + iv(x, y)$ is a
complex analytic function. Then we may solve the Cauchy–Riemann equations
\[ u_x = v_y \quad \text{and} \quad u_y = -v_x. \]
For these equations to have a solution $v$ it is necessary that $u_{xx} = -u_{yy}$, because the
equations we are trying to solve say that these expressions are each equal to $v_{xy}$, but
this condition is met when the given function $u$ is harmonic.

We now solve the equation $v_x = -u_y$ as follows. We can integrate both sides with
respect to $x$ to obtain
\[ v = -\int u_y \, dx, \]
which is single-valued because $u$ is defined on a simply connected domain, so this
determines $v$ up to a constant. Then we check that $v_y = u_x$ by differentiating under
the integral sign:
\[ v_y = -\int u_{yy} \, dx = \int u_{xx} \, dx = u_x, \]
as required. The function $v$ is said to be the “harmonic conjugate” of the given
function $u$.
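
The construction can be tried on a concrete example. The sketch below is an illustration under stated assumptions: it takes $u = e^x \cos y$, whose harmonic conjugate is known to be $e^x \sin y$, and fixes the constant by $v(0, 0) = 0$. To pin down the dependence on $y$ it first integrates $v_y = u_x$ up the $y$-axis and then integrates $v_x = -u_y$ along a horizontal line; the integrals are evaluated by the midpoint rule, and the names `conjugate` and `midpoint` are illustrative.

```python
import math

# Partial derivatives of the assumed example u(x, y) = exp(x) * cos(y).
def u_x(x, y):
    return math.exp(x) * math.cos(y)

def u_y(x, y):
    return -math.exp(x) * math.sin(y)

def midpoint(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f from a to b (b < a is allowed)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def conjugate(x, y):
    """Harmonic conjugate v with v(0, 0) = 0, built from the Cauchy-Riemann equations."""
    up = midpoint(lambda s: u_x(0.0, s), 0.0, y)    # v_y = u_x along the y-axis
    across = midpoint(lambda s: -u_y(s, y), 0.0, x)  # v_x = -u_y along a horizontal line
    return up + across

x0, y0 = 0.7, -1.2
v_num = conjugate(x0, y0)
v_exact = math.exp(x0) * math.sin(y0)
```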

By the time Riemann was writing his Ph.D. thesis in 1850, the basic properties of
harmonic functions were well known. For example, the value of a harmonic function
$h(x, y)$ at a point $(x_0, y_0)$ in its domain is the average of its values on a disc centred at
that point (this is little more than a statement of the Cauchy integral theorem, but it
was known independently as a theorem in harmonic functions due to Gauss). Because
an average value can never also be a largest (or a smallest) value unless all the values
are the same, this means that a non-constant harmonic function takes its maximal value(s)
only on the boundary of its domain. This means that if a non-constant harmonic function
is defined everywhere in the plane it cannot be bounded. In the context of
complex analytic functions, this is Liouville’s theorem. Moreover, if two harmonic
functions defined on the same domain have the same values on the boundary of the
domain, they are equal. For, consider their difference. It is a harmonic function whose
boundary values are zero, and by the earlier remark, this means that the difference
of the two functions is zero everywhere in the domain, and so the two functions are
equal.

E.2  Branch Points and Many-Valued “Functions”


Throughout the nineteenth century many-valued “functions” such as the $n$th root
function, the infinitely many-valued log function, the arcsine function, and others
were considered as legitimate functions. Let us consider the square root “function”,
which assigns to a non-zero complex number $z = re^{i\theta}$ its two square roots
\[ r^{1/2}e^{i\theta/2} \quad \text{and} \quad r^{1/2}e^{i(\theta/2 + \pi)} = -r^{1/2}e^{i\theta/2}, \]
and investigate what happens as the domain variable is moved along a circle around
the origin.

For a fixed value of $r$, as $\theta$ goes from 0 to $2\pi$ and $z$ goes once round the origin on a
circle of radius $r$, the square root that was initially positive goes on a semicircle
from $r^{1/2}$ to $-r^{1/2}$, the negative square root, and the root that was initially negative
goes on a semicircle from $-r^{1/2}$ to $+r^{1/2}$, the positive square root. The two roots are
interchanged in this process.
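
The interchange of the two roots can be watched numerically. This is a minimal sketch, not a historical method: it steps $z$ round a circle in small increments and, at each step, keeps whichever of the two square roots is closer to the previous value, so that the chosen root varies continuously. The function name `continue_sqrt` and the step count are illustrative.

```python
import cmath
import math

def continue_sqrt(turns, r=1.0, steps=720):
    """Follow a continuous square root of z = r e^{i theta} round the origin.

    Starts from the positive root at theta = 0 and lets theta run through
    `turns` full circles, always choosing the root nearest the previous value.
    """
    w = complex(math.sqrt(r), 0.0)
    for j in range(1, turns * steps + 1):
        theta = 2.0 * math.pi * j / steps
        root = cmath.sqrt(cmath.rect(r, theta))  # one of the two square roots
        w = root if abs(root - w) <= abs(-root - w) else -root
    return w

after_one = continue_sqrt(1)   # close to -1: the roots have been interchanged
after_two = continue_sqrt(2)   # close to +1: two circuits restore the original root
```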

Mathematicians of the nineteenth century would say that the square root function is
two-valued, and that it is branched at the point z = 0. Each single-valued determi- 
nation of the function, either of the square roots, is called a branch of the function, 
and it is necessary to define the domain of these functions carefully, as we discuss 
below.

The behaviour of the many-valued function $z \mapsto z^{\alpha}$ is not very different from that
of $z \mapsto z^{1/2}$. We set $z = re^{i\theta}$ and observe that
\[ re^{i\theta} = re^{i(\theta + 2k\pi)}, \]
so the values of $z^{\alpha}$ are all of the form
\[ z^{\alpha} = r^{\alpha}e^{i(\theta + 2k\pi)\alpha} = r^{\alpha}e^{i(\theta\alpha + 2k\pi\alpha)} = r^{\alpha}e^{i\theta\alpha}e^{2ki\pi\alpha} \]
and are obtained from each other by multiplying by a suitable factor of the form $e^{2ki\pi\alpha}$.
If $\alpha = p/q$ in lowest terms then the function takes $q$ distinct values, and if $\alpha$ is
irrational then the function is infinitely many-valued.
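
A short computation illustrates the count of values. The sketch below is only an illustration: the sample point $z = 2 + i$, the range of $k$, and the tolerance used to decide when two values coincide are all assumptions. For $\alpha = 2/3$ only three distinct values appear however many $k$ are tried, whereas for $\alpha = \sqrt{2}$ every $k$ in the range gives a new value.

```python
import cmath
import math
from fractions import Fraction

def power_values(z, alpha, k_range=range(-20, 21)):
    """Distinct values of r^alpha * exp(i (theta + 2 k pi) alpha) for k in k_range."""
    r, theta = cmath.polar(z)
    a = float(alpha)
    found = []
    for k in k_range:
        w = r ** a * cmath.exp(1j * (theta + 2.0 * math.pi * k) * a)
        # Keep w only if it differs from every value found so far.
        if all(abs(w - other) > 1e-9 for other in found):
            found.append(w)
    return found

z = 2.0 + 1.0j
values_two_thirds = power_values(z, Fraction(2, 3))
values_sqrt2 = power_values(z, math.sqrt(2.0))
```

Each of the three values in `values_two_thirds` is a cube root of $z^2$.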

The same account holds for the function
\[ z \mapsto (z - a)^{\alpha}, \]
which is branched at the point $z = a$.
Now let $h(z)$ be an analytic function, and suppose that its power series expansion
in a neighbourhood of the origin is
\[ h(z) = h_0 + h_1 z + h_2 z^2 + \cdots. \]
Because each power of $z$ in this expansion is integral we deduce that the function
$h(z)$ is single-valued in a neighbourhood of the origin. It follows that the function
\[ z \mapsto z^{\alpha}h(z) \]
takes as many values as does the function $z \mapsto z^{\alpha}$. The function $(z - a)^{\alpha}h(z)$ is said
to be branched at the point $z = a$; Riemann said more informally that it behaves like
$(z - a)^{\alpha}$ near the point $a$.

Conversely, if the function $z \mapsto f(z)$ is not single-valued near the origin one can
look for a value of $\alpha$ such that the function
\[ z \mapsto z^{-\alpha}f(z) \]
is single-valued in a neighbourhood of the origin.
For future reference, we note that if we differentiate the function $(z - a)^{\alpha}h(z)$
we find
\[ \frac{d}{dz}\left( (z - a)^{\alpha}h(z) \right) = \alpha(z - a)^{\alpha - 1}h(z) + (z - a)^{\alpha}h'(z) = (z - a)^{\alpha - 1}\left( \alpha h(z) + (z - a)h'(z) \right), \]
so the derived function is branched like $(z - a)^{\alpha - 1}$ at $z = a$.
As we shall see, we are free to consider a complex many-valued function that is
branched at several points $z = a, b, c, \ldots$.

On the other hand, if we stay away from the branch points, and let the domain
variable vary only along curves that do not go round a branch point, we can recover 
a single-valued function, albeit on a restricted domain. This function will be the 
restriction of a branch of the many-valued function to the restricted domain. In the 
main, this is what Cauchy did in developing his theory of complex functions. 

For example, suppose we pick a value of the square root function at the point $z = 4$, say
the value $+2$. Then near to $z = 4$ the value of the square root function that we shall
choose is, of course, the value near to 2. We can proceed in this way by varying $z$ and
assigning the unique value to the square root function that makes our new function
continuous, as long as we do not take $z$ on a loop around the origin. One way to do
this is to decide not to consider values of the square root function on the negative real
axis, the points $z$ for which $z < 0$. The domain of the square root function has been
restricted to the plane of complex numbers with the negative real axis removed (this
is commonly called a cut) and on that domain it is possible to choose a single-valued
branch of the function. More complicated many-valued functions could similarly be
restricted to other simply connected subsets of the plane of complex numbers and in
this way studied, but at the price of introducing an element of arbitrariness into the
theory.

More precisely, on the domain $\{re^{i\theta} : -\pi < \theta < \pi\}$ we define the function
$re^{i\theta} \mapsto r^{1/2}e^{i\theta/2}$. The image of this function is the right-hand half-plane
$\{\rho e^{i\varphi} : -\pi/2 < \varphi < \pi/2\}$.

If, however, we cross the negative real axis, the value of the square root function
would be multiplied by $-1$. More precisely, on the domain $\{re^{i\theta} : -\pi < \theta < \pi\}$ we
also define the function $re^{i\theta} \mapsto -r^{1/2}e^{i\theta/2}$. The image of this function is the left-hand
half-plane $\{\rho e^{i\varphi} : \pi/2 < \varphi < 3\pi/2\}$.

We are free to choose either branch in order to assign a value to the square root
function on the cut, and it is natural to assign points $re^{i\pi}$ the value $r^{1/2}e^{i\pi/2}$.

We shall refer to this later, so let us say in general that a many-valued function 
f is locally single valued if it is possible to choose a domain D, such as a disc, that 
contains no branch points of f, a value of f at a point zo in the domain D, and a 
single-valued function F on the domain D that agrees with the branch of f on D 
that takes the specified value at the point zo. 


E.3 Analytic Continuation 


We have seen that Gauss was clear about the distinction between a complex-valued 
function of a complex variable and a power series representation of that function. 
The rule of thumb for a power series representation is that it is valid on an open disc 
centred at a point P and of radius r > 0 that is determined by the fact that there 
is a point on the boundary of the disc where the function becomes infinite (more 
precisely, ceases to be defined and analytic). 

Thus the power series representation of the function $(1 - z)^{-1}$, which is
\[ 1 + z + z^2 + \cdots + z^n + \cdots, \]
converges on the open disc with centre the origin and radius 1, because the function is not
defined at the point $z = 1$. The same is true of the function $(1 + z^2)^{-1}$, because it is
not defined at the points $z = \pm i$ that lie on the boundary of the unit disc.

Very often one obtains a function by first finding a power series representation of
it.¹⁶ If, for example, one has the series
\[ 1 + z + z^2 + \cdots + z^n + \cdots, \]
then one has a function defined on the open unit disc. But it is possible to obtain a
power series representation for this function on a disc centred at the point $z = -\frac{1}{2}$ by
introducing the new variable $z_1 = z + \frac{1}{2}$, and the radius of convergence of this power
series is $3/2$, the distance from $z = -\frac{1}{2}$ to $z = 1$. The disc on which the new power
series converges partly lies outside the unit disc, so the function defined by the power
series can be extended to a larger domain. In this way, the function can be extended
to its maximal domain, which will be the plane of complex numbers with the point
$z = 1$ removed, because, of course, the power series has been obtained from the
function $(1 - z)^{-1}$. This process of extending the domain of definition of a function
is called analytic continuation. It can be applied to any function defined initially on
some open disc by a convergent power series, and in the present context the examples
we shall consider will produce many-valued functions defined everywhere except at
a finite set of points.¹⁷
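
The re-expansion about $z = -\frac{1}{2}$ can be made explicit and checked. Writing $1 - z = \frac{3}{2} - z_1$ gives $(1 - z)^{-1} = \frac{2}{3} \sum_n (2z_1/3)^n$, which converges for $|z_1| < 3/2$. The sketch below, with illustrative sample points and a fixed number of terms, compares both series with the closed form, including at a point outside the unit disc where only the second series converges.

```python
def series_at_origin(z, terms=200):
    """Partial sum of 1 + z + z^2 + ..., valid for |z| < 1."""
    return sum(z ** n for n in range(terms))

def series_recentred(z, terms=200):
    """Partial sum of the expansion of 1/(1 - z) in powers of z1 = z + 1/2.

    1/(1 - z) = 1/(3/2 - z1) = (2/3) * sum (2 z1 / 3)^n, valid for |z1| < 3/2.
    """
    z1 = z + 0.5
    return (2.0 / 3.0) * sum((2.0 * z1 / 3.0) ** n for n in range(terms))

inside = 0.3 + 0.2j    # inside the unit disc: both series converge
outside = -1.2 + 0.3j  # |z| > 1 but |z + 1/2| < 3/2: only the second converges

err_origin = abs(series_at_origin(inside) - 1.0 / (1.0 - inside))
err_recentred_inside = abs(series_recentred(inside) - 1.0 / (1.0 - inside))
err_recentred_outside = abs(series_recentred(outside) - 1.0 / (1.0 - outside))
```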

The basic facts about analytic continuation concern what happens if a function is 
continued in the way just described along a chain of discs, each one overlapping with 
the one before. For as long as this can be done, one says that the function is obtained 
by analytic continuation from the original disc. If you wish, you may suppose that 
the domain variable moves along a closed curve and at each point is the centre of a 
disc of possibly varying radius which is the disc of convergence of a power series 
representation of that function. But a question arises when a disc overlaps one much 
earlier in the chain of discs. In this situation, it can happen that the values of the 
function on the first and last discs are the same, or that they differ. They differ only 
if the chain goes round a branch point (which must not lie in any of the discs). The 
square root function is a case where something can go wrong. 

If, on the other hand, a function is extended analytically from the same initial disc
along one chain to one disc, $D_1$, and along another chain to another disc $D_2$ and the
extended functions agree in the intersection of $D_1$ and $D_2$, then nothing can be said.
It might be, for example, that the function is the square root function but one of the
chains goes twice round the branch point at the origin. In this case both chains extend
the function to the same value.


¹⁶ This is because a good way of solving many a problem involving analytic functions is the method
of undetermined coefficients.

¹⁷ Many other things can happen, but that is not our subject.

However, if we add the condition that the first path can be deformed until it agrees
with the second path and at no stage does the path as it is deformed pass over a branch 
point, then the result of analytic continuation along the chains will necessarily be the 
same. For the purposes of this chapter, let us call this result the deformation principle. 

The basic principles of analytic continuation were certainly known to Riemann, 
but it was Weierstrass who preferred to build a theory of analytic functions this way. 


E.4  Liouville’s Theorem


There are many ways of distinguishing between a complex analytic function and
a map from $\mathbb{R}^2$ to $\mathbb{R}^2$. We have already remarked that a complex analytic map is
infinitely complex differentiable, and (as a result of theorems due to Cauchy) it is 
expressible locally as a convergent power series. As noted above, Riemann was the 
first to draw attention to a point known earlier to Gauss: at every point where a com- 
plex analytic function has a non-zero derivative it is conformal or angle-preserving. 
But one of the most surprising properties of a complex analytic function was discov- 
ered by Joseph Liouville; it was also known to Riemann, but it is not clear on what 
grounds he believed it. 

The theorem says that a complex analytic function that is bounded everywhere, 
even at infinity, is a constant. One proof uses the Cauchy integral theorem to estimate 
the coefficients in a power series expansion of the function and to show that all terms 
but the constant term are arbitrarily small and must therefore be zero. Another proof 
uses the fact, also proved by Riemann, that the real and imaginary parts of a complex 
function are harmonic, together with the fact that at any point the value of a harmonic 
function is the average of the values the function takes on a neighbourhood of the 
point. This means that a harmonic function can only take a maximum or minimum 
value on the boundary of its domain, so if that domain is the entire plane including 
the point at infinity a bounded harmonic function must be constant. 

The use of Liouville’s theorem is to show that if the quotient of two complex analytic
functions is bounded then the functions are complex multiples of each other.
So for example, if a polynomial $p(z)$ of degree $n$ and the function 1 are such that
the quotient $1/p(z)$ is bounded everywhere then $p(z)$ is a constant. Consider now a
non-constant polynomial with no zeros. It is therefore bounded away from zero, and
so its reciprocal is bounded everywhere, and is therefore constant, which is a
contradiction. Therefore the polynomial $p(z)$ must have a zero: this is the fundamental theorem
of algebra. In the same way a function that has zeros at the points $a_1, a_2, \ldots, a_m$ and
becomes infinite like $1/z$ (simple poles) at the points $b_1, b_2, \ldots, b_m$, and is otherwise
neither zero nor infinite (including at infinity) must be a constant multiple of the
rational function
\[ \frac{(z - a_1)(z - a_2) \cdots (z - a_m)}{(z - b_1)(z - b_2) \cdots (z - b_m)}. \]

Appendix F
Möbius Transformations


Möbius transformations are needed for a look at the work of Schwarz and Poincaré on
the hypergeometric equation, which requires a modest amount of complex analysis.


F.1  Möbius Transformations


A Möbius transformation is a map of the complex plane to itself of the form
\[ z \mapsto \frac{a\bar{z} + b}{c\bar{z} + d}, \]
where $a, b, c, d$ are complex numbers and $ad - bc \neq 0$. We say that the point $\bar{z} =
-d/c$ goes to $\infty$ and that $\infty$ goes to $a/c$, and strictly speaking we should say that
the map is of the extended complex plane to itself.
A proper Möbius transformation is a map of the complex plane of the form
\[ z \mapsto \frac{az + b}{cz + d}. \]
A proper Möbius transformation is obtained by following the map $z \mapsto \bar{z}$ with a
Möbius transformation, and for most purposes it is enough to work with proper
Möbius transformations. I shall drop the word “proper” when it can be inferred from
the context.

Möbius transformations have the convenient property that the transformations
\[ z \mapsto \frac{az + b}{cz + d} \quad \text{and} \quad z \mapsto \frac{kaz + kb}{kcz + kd} \]
are the same for any non-zero $k$, and so the inverse of the Möbius transformation
\[ z \mapsto \frac{az + b}{cz + d} \]
is easy to write down: it is, as you should now check,
\[ z \mapsto \frac{dz - b}{-cz + a}. \]
Show that if $\mu$ is the Möbius transformation
\[ z \mapsto \frac{az + b}{cz + d}, \]
and $\mu'$ is the Möbius transformation
\[ z \mapsto \frac{a'z + b'}{c'z + d'}, \]
then the Möbius transformation $\mu\mu'$ (performing $\mu'$ first) is
\[ z \mapsto \frac{a''z + b''}{c''z + d''}, \]
where $a'', b'', c'', d''$ are given by the matrix product
\[ \begin{pmatrix} a'' & b'' \\ c'' & d'' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}. \]
This allows us to use the convenient notation for the Möbius transformation
\[ z \mapsto \frac{az + b}{cz + d}, \]
\[ z \mapsto A(z), \]
where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. So, writing $A'$ for the matrix for $\mu'$ in the obvious way, the matrix
for $\mu\mu'$ is $AA'$.
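
The correspondence between composition and matrix multiplication is easy to test numerically. The following sketch uses arbitrary illustrative matrices and a sample point; the helper names `mobius`, `matmul` and `inverse` are hypothetical. It checks that applying $\mu'$ and then $\mu$ agrees with the map given by $AA'$, and that the matrix with entries $d, -b, -c, a$ undoes $A$.

```python
def mobius(A, z):
    """Apply the Mobius transformation with matrix A = ((a, b), (c, d)) to z."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

def inverse(A):
    """Matrix of the inverse transformation, up to the irrelevant factor ad - bc."""
    (a, b), (c, d) = A
    return ((d, -b), (-c, a))

A = ((1 + 2j, 3), (0.5j, 2 - 1j))       # illustrative, determinant non-zero
A_prime = ((2, -1j), (1, 1 + 1j))        # illustrative, determinant non-zero
z = 0.3 - 0.8j

composed = mobius(matmul(A, A_prime), z)     # the map with matrix AA'
sequential = mobius(A, mobius(A_prime, z))   # mu' first, then mu
round_trip = mobius(inverse(A), mobius(A, z))
```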
The derivative of the Möbius transformation
\[ f(z) = \frac{az + b}{cz + d} \]
is
\[ f'(z) = \frac{a(cz + d) - c(az + b)}{(cz + d)^2} = \frac{ad - bc}{(cz + d)^2}. \]
Since $ad - bc \neq 0$ the derivative never vanishes, which means that the Möbius
transformation is angle-preserving everywhere.

Because a Möbius transformation is determined by the three ratios $a : b : c : d$ it
is easy to see that there is a unique Möbius transformation sending any three distinct
points to any three distinct points. In particular, the Möbius transformation
\[ z \mapsto \frac{az + b}{cz + d} \]
has this effect:
\[ 0 \mapsto b/d, \quad 1 \mapsto (a + b)/(c + d), \quad \infty \mapsto a/c. \]


Show that the only Möbius transformations that map the set of three points
$\{0, 1, \infty\}$ to itself are given by
\[ z \mapsto z, \quad \frac{1}{z}, \quad 1 - z, \quad \frac{1}{1 - z}, \quad \frac{z - 1}{z}, \quad \text{and} \quad \frac{z}{z - 1}. \]
This group of transformations is important in the study of Riemann’s P-functions.
The equation of the circle centre $(a, b)$ and radius $r$ is
\[ x^2 + y^2 - 2ax - 2by + c = 0, \]
where $r^2 = a^2 + b^2 - c$. Show that the equation can be written as
\[ z\bar{z} - \bar{\alpha}z - \alpha\bar{z} + c = 0, \]
where $\alpha = a + ib$ and $c = \alpha\bar{\alpha} - r^2$. The equation of the circle can also be written
in the suggestive forms
\[ \bar{z} = \frac{\bar{\alpha}z - c}{z - \alpha} \]
and
\[ \bar{z} = Az, \]
where $A = \begin{pmatrix} \bar{\alpha} & -c \\ 1 & -\alpha \end{pmatrix}$. Thus, for example, the unit circle, which has equation $z\bar{z} = 1$,
can be written in the form
\[ \bar{z} = Az, \]
where $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

It is sometimes convenient to speak of the circle $(\alpha, c)$, which has radius $r$ given
by $r^2 = \alpha\bar{\alpha} - c$.


F.2  Inversion in a Circle


Inversion in a circle of radius $r$ and centre $O$ is the map that sends a point $P$ to the
point $Q$ on the half line $OP$ and such that $OP \cdot OQ = r^2$. In particular, inversion in
the unit circle with centre at the origin can be written as
\[ z \mapsto 1/\bar{z} = \frac{z}{|z|^2}. \]
Show, by using a sequence of scalings and translations, that inversion in the circle
centre $\alpha$ and radius $r$ is the map
\[ z \mapsto \frac{r^2}{\bar{z} - \bar{\alpha}} + \alpha = \frac{\alpha\bar{z} - c}{\bar{z} - \bar{\alpha}}, \]
where $c = \alpha\bar{\alpha} - r^2$ as before.
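
The two forms of the inversion map can be compared numerically. The sketch below uses an arbitrary illustrative circle and point, and the function names are hypothetical. It checks the defining property $OP \cdot OQ = r^2$, that $Q$ lies on the half line from the centre through $P$, that the form involving $c = \alpha\bar{\alpha} - r^2$ gives the same point, and that inverting twice returns to $P$.

```python
def invert(z, alpha, r):
    """Inversion in the circle with centre alpha and radius r."""
    return alpha + r * r / (z - alpha).conjugate()

def invert_via_c(z, alpha, r):
    """The same map written as (alpha * conj(z) - c) / (conj(z) - conj(alpha))."""
    c = (alpha * alpha.conjugate()).real - r * r
    zb = z.conjugate()
    return (alpha * zb - c) / (zb - alpha.conjugate())

alpha, r = 1.0 + 2.0j, 1.5   # illustrative circle
P = 2.5 - 0.5j               # illustrative point, not the centre
Q = invert(P, alpha, r)

product = abs(P - alpha) * abs(Q - alpha)  # should equal r^2
ratio = (Q - alpha) / (P - alpha)          # positive real if Q is on the half line OP
same_point = abs(invert_via_c(P, alpha, r) - Q)
back_again = abs(invert(Q, alpha, r) - P)
```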


We shall be interested in the effect of inversion on lines and circles and on angles
between lines and circles, and without loss of generality, we may assume we are
inverting in the unit circle, $z\bar{z} = 1$.

Consider the locus defined by $kz\bar{z} - \bar{\alpha}z - \alpha\bar{z} + c = 0$, where $k = 0$ or 1, which
defines a circle when $k = 1$ and a straight line when $k = 0$. Show that under inversion,
this transforms to the locus $k\frac{1}{z\bar{z}} - \frac{\bar{\alpha}}{\bar{z}} - \frac{\alpha}{z} + c = 0$, which simplifies to
$k - \bar{\alpha}z - \alpha\bar{z} + cz\bar{z} = 0$. This yields four cases:

(1) $k = 1$, $c \neq 0$: a circle not through the origin goes to a circle not through the
origin;

(2) $k = 1$, $c = 0$: a circle through the origin goes to a straight line not through the
origin;

(3) $k = 0$, $c \neq 0$: a straight line not through the origin maps to a circle through the
origin;

(4) $k = 0$, $c = 0$: a straight line through the origin maps to itself.

In each case, these statements need to be modified to take note of the fact that the
origin has been deleted; we shall henceforth assume that this has been done.

Notice that in case (1), which is the case of most interest, we may write the
transformed equation as
\[ z\bar{z} - \frac{\bar{\alpha}}{c}z - \frac{\alpha}{c}\bar{z} + \frac{1}{c} = 0. \]
This makes it clear that the image circle has centre $\alpha/c$ and radius $r/|c|$. Deduce that the
centre of the original circle does not go to the centre of the transformed
circle.
We also need the concept of inversion in a straight line. A straight line has an
equation of the form $ax + by = c$, which, with $\alpha = a + ib$, can be written as
\[ \bar{\alpha}z + \alpha\bar{z} = 2c, \quad \text{or} \quad \bar{z} = \frac{-\bar{\alpha}z + 2c}{\alpha}. \]
Reflection in this line is the map
\[ z \mapsto \frac{-\alpha\bar{z} + 2c}{\bar{\alpha}}. \]
For example, the $x$-axis, $y = 0$, has the equation $z - \bar{z} = 0$ (in this case $\alpha$ is purely
imaginary), and reflection in it is given by $z \mapsto \bar{z}$.

Find the angle between two circles by applying the cosine rule to the triangle
formed by their centres and the relevant point of intersection. If the circles are
$(\alpha, c)$ and $(\alpha', c')$ with radii $r$ and $r'$, respectively, show that the square of the distance
between their centres is $|\alpha - \alpha'|^2$ and that the cosine of the angle between their radii is
\[ \frac{\alpha\bar{\alpha}' + \bar{\alpha}\alpha' - (c + c')}{2rr'}. \]
(a) Deduce that the circles are perpendicular if and only if
\[ \alpha\bar{\alpha}' + \bar{\alpha}\alpha' = c + c'. \tag{F.1} \]
(b) Deduce that a circle $(\alpha', c')$ is perpendicular to the unit circle if and only if
$c' = 1$.

Consider the circles $(\alpha, c)$ and $(\alpha', c')$, which we assume intersect, with radii $r$
and $r'$, respectively. Show that inversion is angle-preserving (up to sign, so strictly
speaking one should say angle reversing) by showing that inversion in the unit circle
sends them to the circles $(\alpha/c, 1/c)$ and $(\alpha'/c', 1/c')$, and that their radii are likewise
transformed to $r/|c|$ and $r'/|c'|$, respectively. Deduce that the angle between the
transformed circles is the same as the angle between the original circles, and so inversion
is angle preserving.

Because inversion is a Möbius transformation, and Möbius transformations are
angle-preserving, inversion in the circle $(\alpha, c)$ maps the circle $(\alpha', c')$ to itself if and
only if the circles are at right angles.

We now connect proper Mobius transformations and inversions by showing that 
every proper Mobius transformation is a product (not in a unique way) of two inver- 
sions. 

We already know that there is a unique Möbius transformation that maps the
points $a, b, c$ in that order to the points $0, 1, \infty$ in that order. So it is enough to find a
product of two inversions that has the same effect. Consider the inversion that maps
$a$ to 0 and $c$ to $\infty$ given by


_fa-c\_ [(k/a —k 
a=({ zl 1 a): 
It maps $b$ to $b'$, say. Now we consider the inversion that maps $(0, b', \infty)$ to $(0, 1, \infty)$
given by $z \mapsto z/b'$. The composite of these two has the required effect.
Draw a picture of the map
\[
z \mapsto e^{\pi i/3} z .
\]




Hint: its fixed points are $z = 0$ and $z = \infty$.
Now draw a picture of the map
\[
\frac{z' - \alpha}{z' - \beta} = e^{\pi i/3}\,\frac{z - \alpha}{z - \beta} .
\]

Hint: what are the two fixed points of this map?
Can you conjugate the second map into the first by a map that sends $\alpha$ to 0 and $\beta$
to $\infty$?


F.2.1 Maps of the Unit Disc to Itself


A map of the unit disc to itself necessarily maps the unit circle to the unit circle, so
we restrict our attention to Möbius transformations. As we have seen, a circle has an
equation of the form $\bar{z} = A(z)$. A Möbius transformation can be written as
\[
z' = M(z),
\]
or $z = M^{-1}(z')$. So the equation of the image circle under this Möbius transformation
is
\[
\overline{M^{-1}(z')} = A M^{-1}(z').
\]

So the circle is mapped to itself if
\[
\bar{M} A M^{-1} = kA
\]
for an arbitrary non-zero complex number k.
Applied to the unit circle, this says, in terms of the entries a, b, c, d of M, that
\[
\bar{a} = kd \quad\text{and}\quad \bar{b} = kc.
\]

Therefore, the Möbius transformations mapping the unit circle to itself are of the
form
\[
z \mapsto \frac{az + b}{\bar{b}z + \bar{a}} .
\]
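A quick numerical check of this form: the sketch below evaluates a map $z \mapsto (az + b)/(\bar{b}z + \bar{a})$ at a few points of the unit circle and confirms that the images have modulus 1. The coefficients are arbitrary illustrative values, chosen with |a| > |b| so that the interior of the disc is also mapped to the interior.

```python
import cmath

def unit_circle_map(a: complex, b: complex):
    """Return z -> (a z + b) / (conj(b) z + conj(a)); assumes |a| != |b| so the map is invertible."""
    a, b = complex(a), complex(b)

    def M(z: complex) -> complex:
        return (a * z + b) / (b.conjugate() * z + a.conjugate())

    return M

M = unit_circle_map(2 + 1j, 0.5 - 0.3j)
for t in (0.0, 1.0, 2.5, 4.0):
    z = cmath.exp(1j * t)   # a point of the unit circle
    print(abs(M(z)))        # stays on the unit circle up to rounding
print(abs(M(0)))            # |b| < |a| here, so the image of the centre lies inside the disc
```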


The real axis can be regarded as the circle with equation $\bar{z} = I(z)$, where I is the
identity matrix, and the same argument shows that a Möbius transformation maps
the upper half-plane to itself if and only if its entries are all real and its determinant
is positive.


F.2.1.1 Coaxial Circles


Because a Möbius transformation can map any three distinct points to any three
distinct points it can map any circle (or straight line) to any circle (or straight line).
So a study of a one-parameter family of circles will often apply to circles that are
somehow tied to two points, as is the case with what are called coaxial circles.

Coaxial circles come in two main families. The first family consists of all the
circles (and straight lines) through two given points. We shall take the points to
be $z = \pm 1$, so the circles have their centres on the y-axis and a is purely imaginary
($a + \bar{a} = 0$). The condition that the circles pass through the points $\pm 1$ forces $c = -1$.
So circles in the first coaxial family have equations of the form
\[
x^2 + y^2 - 2by - 1 = 0, \quad a = ib, \quad c = -1. \tag{F.2}
\]


The second family seems artificial at first sight. We start from the observation that
if
\[
S = x^2 + y^2 - 2ax - 2by + c = 0
\]
and
\[
S' = x^2 + y^2 - 2a'x - 2b'y + c' = 0
\]
are the equations of two circles, then the equation
\[
\lambda S + \mu S' = (\lambda + \mu)(x^2 + y^2) - 2(\lambda a + \mu a')x - 2(\lambda b + \mu b')y + (\lambda c + \mu c') = 0
\]
is also the equation of a circle.

We now regard the points $z = -1$ and $z = 1$ as circles of zero radius, and consider
what circles we get by the above routine. The equations of the point circles, as they
are called, are
\[
S = x^2 + y^2 + 2x + 1 = 0
\]
and
\[
S' = x^2 + y^2 - 2x + 1 = 0.
\]

So the family of circles that we obtain has equations of the form
\[
\lambda S + \mu S' = (\lambda + \mu)(x^2 + y^2) + 2(\lambda - \mu)x + (\lambda + \mu) = 0,
\]
which we write in the form
\[
x^2 + y^2 + 2\,\frac{\lambda - \mu}{\lambda + \mu}\,x + 1 = 0
\]
or, setting $\frac{\lambda - \mu}{\lambda + \mu} = \alpha$, as
\[
x^2 + y^2 + 2\alpha x + 1 = 0, \quad a' = -\alpha, \quad c' = 1. \tag{F.3}
\]
If we now apply the rule for determining if two circles meet at right angles, (F.1),
we find that
\[
a\bar{a}' + \bar{a}a' = -ib\alpha + ib\alpha = 0 \quad\text{and}\quad c + c' = 0,
\]


so every member of the first coaxial family is perpendicular to every member of the 
second coaxial family. 
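The same conclusion can be cross-checked with Pythagoras alone; this is an added illustration rather than part of the original argument. Two circles meet at right angles exactly when the square of the distance d between their centres equals the sum of the squares of their radii. Reading the data off the two family equations, a circle $x^2 + y^2 - 2by - 1 = 0$ of the first family has centre (0, b) and $r^2 = b^2 + 1$, while a circle $x^2 + y^2 + 2\alpha x + 1 = 0$ of the second family has centre $(-\alpha, 0)$ and $r'^2 = \alpha^2 - 1$ (so it is a genuine circle only when $|\alpha| > 1$). Then:

```latex
\[
d^2 = \alpha^2 + b^2 = (b^2 + 1) + (\alpha^2 - 1) = r^2 + r'^2 .
\]
```

The identity holds for every b and every admissible $\alpha$, which is the orthogonality of the two families.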

A striking version of this result is obtained by considering, as we may, all the
circles and straight lines through the points $z = 0$ and $z = \infty$. The first family of
coaxial circles in this case consists of all lines through the origin, and the second 
of all the circles whose centres are at the origin. Because the first picture can be 
conjugated into the second, they are equivalent for the purposes of inversion. 

There is a third family of coaxial circles, which is obtained by starting with two 
coincident points, but we shall not need it. 


Appendix G 
Lipschitz and Picard 


In 1877 the German mathematician Rudolf Lipschitz published a paper [190], in 
which he observed that the question of the existence of solutions to a system of 
ordinary differential equations had been solved in the complex analytic case by 
Weierstrass [266] and Briot and Bouquet ([24], 49), and solutions shown to exist 
at least on some suitable domain. The method first obtained a formal power series
that “solves” the equation, and then showed that on some domain around the initial
point the series converges. However, there was no proof that a system of ordinary
differential equations can be solved in the real case, and this was a gap he proposed 
to fill. 
The system of equations for functions $y^1, y^2, \ldots, y^n$ has the form
\[
\frac{dy^a}{dx} = f^a(x, y^1, y^2, \ldots, y^n) \quad (a = 1, 2, \ldots, n).
\]
Lipschitz assumed that the functions $f^a$ are defined and continuous on some domain
G where they are bounded above by some given quantity. Furthermore, he assumed
that
\[
|f^a(x, k^1, \ldots, k^n) - f^a(x, y^1, \ldots, y^n)| < c^{a1}|k^1 - y^1| + \cdots + c^{an}|k^n - y^n|, \tag{G.1}
\]
where the quantities $c^{ab}$ are positive constants. The initial conditions are that when
$x = x_0$, $y^a = y_0^a$, and the point $(x_0, y_0^1, \ldots, y_0^n)$ lies (as we would say) in the interior
of G, so that there are positive quantities $a_0$, $b_0$ such that if the point $(x, y^1, \ldots, y^n)$
satisfies the inequalities
\[
|x - x_0| \le a_0, \quad |y^a - y_0^a| \le b_0,
\]
then it lies in G.

Lipschitz was then able to show the existence of a domain H lying entirely in G
such that there is a system of functions $y^1, y^2, \ldots, y^n$ that satisfies the initial conditions
at $x_0$ and lies inside H for $|x - x_0| < A_0$ for some $A_0 < a_0$. Lipschitz then divided
the interval $[x_0, x_0 + A_0]$ into p equal pieces (the interval $[x_0 - A_0, x_0]$ is dealt with
similarly). In each of these intervals he found quantities $\eta^a$ such that
\[
\eta_1^a - y_0^a = f^a(x_0, y_0^1, \ldots, y_0^n)(x_1 - x_0), \quad a = 1, 2, \ldots, n,
\]
and
\[
\eta_{\alpha+1}^a - \eta_\alpha^a = f^a(x_\alpha, \eta_\alpha^1, \ldots, \eta_\alpha^n)(x_{\alpha+1} - x_\alpha), \quad a = 1, 2, \ldots, n,
\]
where $\alpha = 1, 2, \ldots, p - 1$. The inequality (G.1) shows that all these points lie in H.
Lipschitz then proved that as a finer and finer subdivision is produced, the variables
tend to limits that establish the existence of solutions to the system of ordinary
differential equations.
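The step-by-step construction just described is essentially what is now called Euler's method. The sketch below applies it to a system and shows the approximations settling down as the subdivision is refined. The test system (y1' = y2, y2' = -y1, with solution (sin x, cos x)) and all names are illustrative assumptions; they do not come from Lipschitz's paper.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]

def euler_polygon(f: Callable[[float, Sequence[float]], Sequence[float]],
                  x0: float, y0: Sequence[float], A0: float, p: int) -> Vector:
    """Approximate the solution at x0 + A0 by stepping across p equal subintervals."""
    h = A0 / p
    x, eta = x0, list(y0)
    for _ in range(p):
        slopes = f(x, eta)                               # values f^a at the current point
        eta = [e + s * h for e, s in zip(eta, slopes)]   # new eta = old eta + f * (step in x)
        x += h
    return eta

def rotation(x: float, y: Sequence[float]) -> Vector:
    """Hypothetical test system y1' = y2, y2' = -y1, solved by (sin x, cos x)."""
    return [y[1], -y[0]]

for p in (10, 100, 1000):
    eta = euler_polygon(rotation, 0.0, [0.0, 1.0], 1.0, p)
    print(p, abs(eta[0] - math.sin(1.0)))   # the error shrinks as p grows
```

The loop only illustrates convergence in one example; the substance of Lipschitz's proof is that condition (G.1) guarantees it in general.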

The novelty of Lipschitz’s method is partly that it applies to systems of ordi- 
nary differential equations, and partly that inequality (G.1) is weaker than Cauchy’s 
conditions—it is, indeed, the famous Lipschitz condition. That said, it is clear that 
Lipschitz did not know of Cauchy’s earlier paper, which he never mentioned. 


G.1 Picard’s Method 


Let us first take a single first-order equation¹⁸
\[
\frac{dy}{dx} = F(x, y),
\]
then, setting $y = y_0$ when $x = x_0$, one can establish the fundamental existence theorem for
this equation. To this end, consider the equations
\[
\begin{aligned}
\frac{dy_1}{dx} &= F(x, y_0), \\
\frac{dy_2}{dx} &= F(x, y_1), \\
&\;\;\vdots \\
\frac{dy_n}{dx} &= F(x, y_{n-1}),
\end{aligned}
\]
effecting each quadrature¹⁹ in such a way that for $x = x_0$ one has $y = y_0$. The problem is
to prove that, as $n \to \infty$, $y_n$ tends to a limit y which represents the desired integral provided
that x remains in the neighborhood of $x_0$. We assume that the function F(x, y) is continuous
and defined for values of x and y between $x_0 - a$ and $x_0 + a$ on the one hand and $y_0 - b$
and $y_0 + b$ on the other; moreover, that one can determine a positive constant k such that
\[
|F(x, y_2) - F(x, y_1)| < k|y_2 - y_1|;
\]


we also assume that the function and the variables are real.


¹⁸ This is a lightly corrected version of the translation in the Birkhoff Source Book, 250-251.
¹⁹ Picard here chooses the constants of integration.

Let M be the maximum of $|F(x, y)|$ when x and y remain between the indicated
limits. One will have
\[
y_1 = \int_{x_0}^{x} F(x, y_0)\,dx + y_0 .
\]
Let $\rho$ be a quantity at most equal to a: $y_1$ will stay within the desired limits if $M\rho < b$, and it
is evident that the same will be true for $y_2, \ldots, y_n$. Letting $\delta$ denote a quantity at most equal
to $\rho$, we will suppose that x remains between $x_0 - \delta$ and $x_0 + \delta$. We then have, on putting
$y_n - y_{n-1} = z_n$,
\[
\begin{aligned}
\frac{dz_1}{dx} &= F(x, y_0), \\
\frac{dz_2}{dx} &= F(x, y_1) - F(x, y_0), \\
&\;\;\vdots \\
\frac{dz_n}{dx} &= F(x, y_{n-1}) - F(x, y_{n-2}),
\end{aligned}
\]
and all the z vanish at $x = x_0$. One has $|z_1| < M\delta$, $|z_2| < kM\delta^2$, $|z_3| < k^2M\delta^3$ and generally,
$|z_n| < M\delta(k\delta)^{n-1}$.


Hence, writing
\[
y_n = y_0 + z_1 + z_2 + \cdots + z_n,
\]
one sees that $y_n$ tends to a limit if $k\delta < 1$. As a decreasing geometric progression, the series
\[
y_0 + z_1 + z_2 + \cdots + z_n + \cdots
\]
will be convergent. Thus $y_n$ converges to a limit y when x remains between $x_0 - \delta$ and $x_0 + \delta$,
$\delta$ being the smallest of the quantities a, b/M, 1/k. In this interval, y evidently represents a
continuous function of x. Thus one also has
\[
y_n = \int_{x_0}^{x} F(x, y_{n-1})\,dx + y_0,
\]
and, as $y_n$ and $y_{n-1}$ tend to y, it follows that
\[
y = \int_{x_0}^{x} F(x, y)\,dx + y_0,
\]
and hence dy/dx = F(x, y); that is, the limit y satisfies the differential equation. Thus, the
existence of the solution has been established. One can evidently employ the same type of
proof if F is an analytic function of the complex variables z and w.
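Picard's successive approximations can be carried out exactly when every iterate is a polynomial. The sketch below does this for the illustrative equation dy/dx = y with y = 1 at x = 0, an example chosen here and not taken from Picard, for which the iterates are the partial sums of the exponential series. Polynomials are stored as coefficient lists, lowest degree first, and `F` is assumed to map a polynomial in x to a polynomial in x.

```python
from fractions import Fraction
from typing import Callable, List

Poly = List[Fraction]  # coefficients of 1, x, x^2, ...

def integrate_from_zero(poly: Poly) -> Poly:
    """Integral from 0 to x of the polynomial, term by term."""
    return [Fraction(0)] + [coeff / (k + 1) for k, coeff in enumerate(poly)]

def picard(F: Callable[[Poly], Poly], y0: int, n: int) -> Poly:
    """Apply n Picard steps y_k = y0 + integral of F(y_{k-1}), starting from the constant y0."""
    y: Poly = [Fraction(y0)]
    for _ in range(n):
        nxt = integrate_from_zero(F(y))
        nxt[0] += y0          # choose the constant of integration so that y(0) = y0
        y = nxt
    return y

# dy/dx = y: the right-hand side applied to an iterate is the iterate itself.
y4 = picard(lambda poly: poly, 1, 4)
print(y4)   # coefficients 1, 1, 1/2, 1/6, 1/24
```

In this example the differences $z_n = y_n - y_{n-1}$ are the single terms $x^n/n!$, so the convergence of the series $y_0 + z_1 + z_2 + \cdots$ is visible directly.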


Appendix H 
The Assessment 


H.1 Introduction 


In any historical or reflective essay it’s always good to push for more evidence and 
better arguments. If you want to claim that a book is important because it marks 
a significant advance, ask yourself: What advance? Why was that important? How 
does that book do it? When you have answered those questions, ask the next round 
of questions, such as: What was known before? Who said it was important? If, say, 
the book displays an improved use of the calculus then what was that improvement? 
Was it a new technique? An old technique in a new application? And so on. 

As for evidence, quotes always help, such as, in the case of the first essay, Euler’s 
or Clairaut’s comments on how difficult the Principia is to read, or Euler’s remark 
in his Mechanica to the effect that even a slight change from one problem to the next 
can produce great difficulties. 


H.2 Assessment 1 


Set at the end of week 3, to be handed in at the end of week 4, and returned to the 
students at the end of week 5. 


Question 1 Imagine you are a British Professor of mathematics in about the year 1770
who is recommending a good student to spend a year studying mathematics with
either Euler or Lagrange. Explain to him or her:


EITHER In what ways is Euler’s theory of mechanics an improvement on Newton’s 
Principia. 


OR What is involved in the study of partial differential equations.


Your answer should describe what has been taken to be important in the topic you
select, and why.


This question is to be answered in not more than 500 words, which is a single side 
of A4 in 10-point print, one and a half line spaced. 

Please do not use a smaller font size. 

Leave room for me to scribble comments. 

I will NOT turn over the page—your answer must be on one side of a piece of A4. 
Contact me directly or by e-mail if you cannot comply with these requirements. 


Advice Think hard about what is the significance of what you report upon. In par- 
ticular, do not diminish Newton’s remarkable achievements in the Principia. 

When you are offering an opinion or judgement (as in “How important was...” or 
“why was ....”) give a brief argument in support of your opinion. 

Distinguish between contemporary criticisms of someone’s work and your own 
judgements. Don’t be afraid of offering your own judgement: you don’t have to 
understand everything you’ve read or heard, but do not say anything from the per- 
spective of the present day that could not have been said in 1770. Think about what 
was picked up, what was missing in the actual reception of these ideas. 

You do not have very many words, so anything you say about people must be 
essential—think of the mathematical styles of Euler and Lagrange, not their per- 
sonalities. 

No need to mention anything of a family or personal nature—imagine you’ve written 
all that good stuff elsewhere in the letter. 


You may find it helpful to write notes for the essay first. Try to use not more than 
150 words and observe the following rules: 

Each note should be one sentence long and should contain exactly one idea. 

The sentences should be organised in groups according to topic. 

The sentences, and the topics, should be arranged in a sensible order. 

When you have finished, you should be confident that you can write an essay of the 
required length in which the topics come in this order. 


H.3 Assessment 2


EITHER Choose one of the following, and write an essay based on it that demon- 
strates some understanding of the mathematics, and situates the people and the ideas 
in a historical context. 

Spend roughly three pages describing the most important features of the text, and
a page incorporating your analysis into an account of its importance when it was
published.

• Cauchy: Note on the integration of first-order partial differential equations in any
number of variables (see the translation in Sect. 31.1).
• Thomson and Stokes on the telegraphist’s equation.


OR write an article on the hypergeometric equation, covering the work of Gauss, 
Riemann, Schwarz, and Poincaré. 
OR write an article on the works of Schwarz mentioned in the course. 


[The course website had pdf copies of all of these essays.]


This question is to be answered in not more than 2200 words, which is four sides of 
A4 in 11-point print, one and a half line spaced, (or up to four and a half pages in 
TeX). 
Please do not use a smaller font size. 

A further page may be used for diagrams, and a modest number of extra words (I 
won’t count them). 
Bibliographical information should also be given at the end; I won’t count the words.
Leave room for me to scribble comments. 
Comments on these passages will be found below. 


Advice on Choosing an Extract 

In order to choose the text you intend to work on, I suggest that you read all the texts 
quickly over once and find (at least) one you want to proceed with. None of them are 
altogether easy to read. Let the obscure bits wash over you and wait for something 
more comprehensible to turn up. You may find that an offending paragraph has an 
easier second part, or that it is followed by an easier one. Try to form some sense of 
what the text is about. Terminology can be unclear. List the terms you don’t know 
the meaning of and e-mail me if there’s a problem. 

Once you have chosen a text or set of texts, a good general strategy would be to 
look quickly at each of its sections and write down what is claimed in each of them 
without at this stage worrying about how anything was proved. This will give you a 
skeleton to work on and enable you to see the general argument. 

Grapple with as much of one of these texts as you can. If parts are too hard, be 
sure that you need to understand and describe them—you may or may not. You may 
make a modest use of footnotes to alert me that you did not understand something. 

The essays by Thomson²⁰ can be found in the Digital Mathematics Library and
the Internet Archive on the web, which send you to Gallica, the site of the Biblio-
théque Nationale in Paris. You may find these easier to read than a printed copy, but
the graph in Thomson’s paper is not at all clear. The copy here is from the original
publication in the Proceedings of the Royal Society for 1855.²¹


²⁰ Enter http://gallica.bnf.fr/ark:/12148/bpt6k95119q for Thomson.
²¹ Enter http://rspl.royalsocietypublishing.org/content/7/382.full.pdf+html.


Advice on writing your essay The main point of this exercise is to put you in
the situation of a mathematician (student or professional) who has just studied a 
mathematical topic (one whose history we are studying in this course) and for you 
to show that you understand the mathematics and its importance. 

For some of the extracts, you may well want to say that the author’s reasoning is odd, 
even wrong; if you do so, prove your claim. 

It is better to demonstrate a real understanding of a piece of the mathematics than a 
superficial understanding of all of the extract. 

Your secondary task is to say something historical about the extract and its author 
that uses your analysis of the extract to establish its importance in the context of its 
time. 

You may stray beyond the Lecture Notes and draw on information accessible in 
a good library (for example, the Dictionary of Scientific Biography and standard 
histories of mathematics such as Kline’s) or on the Web (keep a critical edge), but 
maximum marks are available for information entirely drawn from the Notes. 


A comment on How the Assignment Will Be Marked 
I am looking for well-written essays, so an extra mark will be given for essays that are
well organised and literate. So, for example, a coherent account that is mathematically 
correct and insightful but presented in ungrammatical prose will get one mark less 
than the same account in grammatical prose. If you want this mark, avoid English 
that is too conversational, flippant, or childish. Address to impress! 

If you want to get marks for good writing in your final essay but aren’t sure how, 
let me know. I can also look (but only quickly) at specific requests for help, brief 
outlines, and the like. 


H.3.1 Cauchy 


Two things are required here. 


• I want you to grapple with Cauchy’s argument and to compare it with Monge’s
account and with a modern account, such as the one in the appendix that forms
Chap. C.

• I want you to explain the crucial difference between the (quasi)-linear case and
the general case.

• The hardest part of the paper is seeing how Cauchy handled the initial conditions.
He succeeds, but with unfortunate notation and less clarity than one would like.
You may find it easier to work back from the modern account (see Appendix C).

• You may find it helpful to use the following partial differential equation as your
worked example if you give one
\[
F(x, y, z, p, q) = xp^2 + yq^2 - 2z = 0,
\]
with initial conditions that along the initial curve
\[
(x(0, s), y(0, s), z(0, s), p(0, s), q(0, s)) = (1 + s, 1 - s, s^2, s, -s), \quad 1/4 < s < 1.
\]

Or you may prefer to be clear what is involved in solving this equation, and use your
knowledge to assess the ideas of Cauchy. Do not attempt to reach the complete
solution in parametric form (x(s, t) =, etc.) but indicate the integrals that would
have to be evaluated, and assuming that they have been evaluated indicate how the initial
conditions are used.
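Before working with this example it is worth confirming that the initial data are admissible. The sketch below reads the equation as $F = xp^2 + yq^2 - 2z$ and the initial curve as $(1 + s, 1 - s, s^2, s, -s)$, and checks two things: that F vanishes along the curve, and that the data satisfy the usual compatibility (strip) condition dz/ds = p dx/ds + q dy/ds. The strip condition is a standard requirement assumed here; it is not stated in the text above, and the function names are illustrative.

```python
from fractions import Fraction

def F(x, y, z, p, q):
    """Left-hand side of the equation, read as x p^2 + y q^2 - 2 z."""
    return x * p**2 + y * q**2 - 2 * z

def initial_strip(s):
    """Initial data (x, y, z, p, q) at parameter value s."""
    return (1 + s, 1 - s, s**2, s, -s)

def strip_defect(s):
    """dz/ds - (p dx/ds + q dy/ds), using the derivatives of the data above."""
    x, y, z, p, q = initial_strip(s)
    dx_ds, dy_ds, dz_ds = 1, -1, 2 * s
    return dz_ds - (p * dx_ds + q * dy_ds)

# Exact rational sample values of s in the interval 1/4 < s < 1.
for s in (Fraction(1, 3), Fraction(1, 2), Fraction(3, 4)):
    print(s, F(*initial_strip(s)), strip_defect(s))   # both quantities vanish
```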


Ideally, you will finish up understanding the subject of first-order partial differential 
equations much better as a result. 


H.3.2 Thomson


These are some of the actual documents that led to the creation of the first successful 
trans-Atlantic telegraph, so they are genuinely applied mathematics, and I thought it 
might be interesting for you to work out some of the thinking behind it. So perhaps 
we want a greater sense of the difficulties and the uncertainties involved in the work 
as well as of the mathematics that was used, and an indication of what was good about
it. 


H.3.3 The Hypergeometric Equation 


I suggest that this essay has an introduction (where you write down the hypergeomet- 
ric equation and comment on its key properties) and a conclusion, and in between one 
page on what Gauss did and rather more on the contributions of Riemann, Schwarz, 
and Poincaré. 

For Gauss, state what he discovered about solutions of the hypergeometric equa-
tion in the neighbourhood of the points $z = 0, 1, \infty$, and explain the distinction he
drew between a (possibly many-valued) function and its power series expansions. 

For Riemann, explain what a P-function is locally by definition, and what the key 
properties of such functions are. Compare this with the usual situation for solutions 
of a linear second-order ordinary differential equation, and indicate the key steps in 
Riemann’s argument that his function is the general solution of the hypergeometric 
equation studied by Gauss. (You can read Riemann’s paper if you are comfortable 
with the idea of analytic continuation around a branch point, or willing to become so. 
I have omitted Section 5 of the paper, which dealt with a technical point not needed
for a sufficient appreciation.)

For Poincaré, use Schwarz’s work to explain how special cases of the hyper-
geometric equation lead to triangles with particularly simple properties, and then
indicate how non-Euclidean geometry naturally enters the story (you may take the 
case where the branch points have orders 2, 3, and 7—very attractive figures for this 
case are on the web, and see also Fig. 16.3). 


H.3.4 Schwarz 


His work is strikingly coherent and this essay calls for a detailed analysis of both 
the texts and their context. Your essay should discuss his alternating method, explain 
how it is connected to the so-called Schwarz—Christoffel theorem, and why mapping 
the half-plane to a triangle, to a square, and to a general quadrilateral (each with 
vertices specified in advance) are problems of increasing difficulty and, if possible, 
show how to solve them. 

[I note here that students can also read Green’s “An Essay on the application 
of mathematical analysis to the theories of electricity and magnetism”, which is 
reprinted in his Mathematical Papers (pp. 356-374); consult the Digital Mathematics
Library and the Internet Archive on the web. There are two proofs (Sects. 4 and 5)
where Green slips between mathematics and physics; it is good to bring them out, to 
make sense of Sect. 6, and then to compare his paper with the ideas of, for example, 
Gauss or Dirichlet.] 


H.4 Assessment 3 


The history of partial differential equations in the nineteenth century belongs to 
applied mathematics, not pure mathematics. To what extent do you agree, and why? 


H.4.1 Advice 


Claims like this can have several different kinds of answers. Clearly it depends on 
what is meant by such terms as pure and applied mathematics, about which there is 
legitimate disagreement. 

You might reply, for example, that the claim 


• is clearly true.

• is clearly false.

• is true in certain respects but not in others, say because the story is better seen as
some mix of pure and applied.

• is only true because what happened is some mix of pure and applied mathematics,
but it’s better seen as simply mathematics underneath.

• is silly, perhaps because you don’t accept the terms of the debate, and find the terms
pure and applied are superficial (they may mislead) and it’s all simply mathematics.


Therefore, it is essential to have an opinion, to state it clearly at the start, to argue 
for it, to address such counter-arguments as seem to you to have merit, and to reach 
a substantial conclusion. 

To organise your thoughts, you should think about the aims, methods, and results 
of several mathematicians of the period, and about how they might have answered 
the question. Think also about the significant theoretical advances made in the study 
of partial differential equations in the nineteenth century, as well as the significant 
problems that were solved and applications that were made. 

Your essay should respond to the work of most of the leading mathematicians 
mentioned in the course. The nineteenth century is here defined to cover the period 
discussed from Lecture 10 to the end of the course. 

When you are ready to start writing, structure your essay so that it is easy to 
appreciate. 


• The Introduction should be very brief, but indicate the key points that you will
make in your essay. In particular, state the opinion that you are going to argue for.

• You should decide what you mean by such terms as pure mathematics, applied
mathematics, geometry, and even physics. You may find it helpful to define what
you mean by these terms early in your essay.

• The body of your essay should be rich but clear.

• Every paragraph must support your argument, even the ones where you are showing
why your evidence doesn’t support a different opinion.

• Be prepared to defend a complicated position (the first and second halves of the
nineteenth century were different, for example, if you think they were) or an
extreme position if you have one.

• Your answer is a measure of the extent of your agreement or disagreement, it is
not a string of facts about partial differential equations.

• Facts about partial differential equations are grist to your agreement or disagree-
ment and must be presented clearly in their own right and as supporting that
agreement or disagreement.

• If you are arguing for a complicated position, it might be a good idea to give a
paragraph to each single aspect of it.

• If you are arguing for a simple position (such as “clearly true”, “clearly false”, or
“silly”) be sure you explain why and offer rebuttals of opposing positions.

• The Conclusion should restate the position of the Introduction, but in a way that
recalls subtleties and complications acknowledged in the body of the essay.


You have to decide what is right for you, state that position, and defend it like a 
lawyer. Channel your favourite court room drama, call your expert witnesses. Give 
clear references to all the sources you use, so that they can be checked; for web-based 
sources give me the URL.

Full marks are available for essays that draw only on course material, but you are
welcome to move beyond.


H.4.2 How the Essays Will Be Marked 


To obtain a mark appropriate to a First: 

Quality of argument: The argument is convincing, supported with relevant facts, and 
well organised. 

The judgements reached are, when necessary, subtle, and balanced. 

The coverage is broad—there are no potentially damaging omissions. 

There are no unnecessary digressions. 

Ideally, and without being cranky, the essay is original (in its emphases, or its con- 
clusions). 

Good use of quotations. 

Accuracy: The historical facts are indeed correct and to the point. 

The mathematics is correct, clear, and relevant. 

The written English, as a piece of prose, is well written, and of the right length. 
And remember that extra mark for essays that are genuinely well written. 


Upper Second: 
Falls below the above in one or two significant ways. 


Lower Second: 
Falls below what is required for an Upper Second in one or two significant ways. An 
unconvincing or desultory argument. 


Third: 
Shows a bare knowledge of the topic, but is poorly organised and/or at times inac- 
curate. 


Fail: 
Does not demonstrate a knowledge of the topic. 


Borderline distinctions It can be hard to tell a First from a good Upper Second. 
Roughly speaking, a First-class essay is something you (and I!) should be proud of 
and you (or I) could put with confidence in front of anyone, whereas varieties of 
Second go to good, and even very good, work. On any borderline, a good original 
point can push you up, a significant error can push you down. At the other end, a 
Third says “Yes, you know some things you didn’t know before but only just enough” 
and a Lower Second says that you are either generally, if intangibly, better than that, 
or at some identifiable point clearly better than that. A Fail mark goes to an essay 
that doesn’t do more than recycle facts but lacks coherence or an argument, or fails 
to address the question of importance. 


Plagiarism The taking of information and arguments from sources you do not 
acknowledge is theft. It will result in a Fail.